mercredi 6 mai 2015

ROI-based KLT optical tracker in opencv

How can I add ROI-based selection to lkdemo.cpp (the KLT optical-flow tracker example in OpenCV)? I want to select an ROI in the first frame and track only the feature points detected inside that ROI.


#include "opencv2/video/tracking.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/highgui/highgui.hpp"

#include <iostream>
#include <ctype.h>

using namespace cv;
using namespace std;

static void help()
{
    // print a welcome message, and the OpenCV version
    cout << "\nThis is a demo of Lucas-Kanade optical flow lkdemo(),\n"
            "Using OpenCV version " << CV_VERSION << endl;

}

Point2f point;
bool addRemovePt = false;

static void onMouse( int event, int x, int y, int , void* )
{
    if( event == CV_EVENT_LBUTTONDOWN )
    {
        point = Point2f((float)x, (float)y);
        addRemovePt = true;
    }
}

int main( int argc, char** argv )
{
    help();

    VideoCapture cap(CV_CAP_ANY);
    TermCriteria termcrit(CV_TERMCRIT_ITER|CV_TERMCRIT_EPS, 20, 0.03);
    Size subPixWinSize(10,10), winSize(61,61);

    const int MAX_COUNT = 500;
    bool needToInit = false;
    bool nightMode = false;

    //if( argc == 1 || (argc == 2 && strlen(argv[1]) == 1 && isdigit(argv[1][0])))
        //cap.open(argc == 2 ? argv[1][0] - '0' : 0);
    //else if( argc == 2 )
        //cap.open(argv[1]);

    if( !cap.isOpened() )
    {
        cout << "Could not initialize capturing...\n";
        return 0;
    }

    namedWindow( "LK Demo", 1 );
    setMouseCallback( "LK Demo", onMouse, 0 );

    Mat gray, prevGray, image;
    vector<Point2f> points[2];

    for(;;)
    {
        Mat frame;
        cap >> frame;
        if( frame.empty() )
            break;

        frame.copyTo(image);
        cvtColor(image, gray, COLOR_BGR2GRAY); // frames from VideoCapture are BGR, not RGB

        if( nightMode )
            image = Scalar::all(0);

        if( needToInit )
        {
            // automatic initialization
            goodFeaturesToTrack(gray, points[1], MAX_COUNT, 0.01, 10, Mat(), 3, 0, 0.04);
            cornerSubPix(gray, points[1], subPixWinSize, Size(-1,-1), termcrit);
            addRemovePt = false;
        }
        else if( !points[0].empty() )
        {
            vector<uchar> status;
            vector<float> err;
            if(prevGray.empty())
                gray.copyTo(prevGray);
            calcOpticalFlowPyrLK(prevGray, gray, points[0], points[1], status, err, winSize,10, termcrit, 0, 0.001);
            size_t i, k;
            for( i = k = 0; i < points[1].size(); i++ )
            {
                if( addRemovePt )
                {
                    if( norm(point - points[1][i]) <= 5 )
                    {
                        addRemovePt = false;
                        continue;
                    }
                }

                if( !status[i] )
                    continue;

                points[1][k++] = points[1][i];
                circle( image, points[1][i], 3, Scalar(0,255,0), -1, 8);
            }
            points[1].resize(k);
        }

        if( addRemovePt && points[1].size() < (size_t)MAX_COUNT )
        {
            vector<Point2f> tmp;
            tmp.push_back(point);
            cornerSubPix( gray, tmp, winSize, Size(-1,-1), termcrit);
            points[1].push_back(tmp[0]);
            addRemovePt = false;
        }

        needToInit = false;
        imshow("LK Demo", image);

        char c = (char)waitKey(10);
        if( c == 27 )
            break;
        switch( c )
        {
        case 'r':
            needToInit = true;
            break;
        case 'c':
            points[0].clear();
            points[1].clear();
            break;
        case 'n':
            nightMode = !nightMode;
            break;
        }

        std::swap(points[1], points[0]);
        cv::swap(prevGray, gray);
    }

    return 0;
}
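To restrict detection to an ROI, goodFeaturesToTrack already accepts a mask parameter: pass a single-channel 8-bit image that is non-zero only inside the selected rectangle, e.g. `Mat mask = Mat::zeros(gray.size(), CV_8UC1); mask(roi).setTo(255); goodFeaturesToTrack(gray, points[1], MAX_COUNT, 0.01, 10, mask, 3, 0, 0.04);`. The sketch below illustrates the same two ideas (building an ROI mask, and dropping tracked points that drift outside the ROI) with hypothetical plain-C++ stand-ins for cv::Rect and cv::Point2f so it stays self-contained:

```cpp
#include <cassert>
#include <vector>

// Minimal stand-ins for cv::Point2f / cv::Rect; in the real program you
// would use the OpenCV types and pass the mask to goodFeaturesToTrack.
struct Point2f { float x, y; };
struct Rect {
    int x, y, width, height;
    bool contains(Point2f p) const {
        return p.x >= x && p.x < x + width && p.y >= y && p.y < y + height;
    }
};

// Build a single-channel mask that is 255 inside the ROI and 0 elsewhere.
std::vector<unsigned char> makeRoiMask(int rows, int cols, Rect roi) {
    std::vector<unsigned char> mask(rows * cols, 0);
    for (int r = roi.y; r < roi.y + roi.height; ++r)
        for (int c = roi.x; c < roi.x + roi.width; ++c)
            mask[r * cols + c] = 255;
    return mask;
}

// After calcOpticalFlowPyrLK, optionally discard points that left the ROI.
std::vector<Point2f> keepInsideRoi(const std::vector<Point2f>& pts, Rect roi) {
    std::vector<Point2f> kept;
    for (const Point2f& p : pts)
        if (roi.contains(p)) kept.push_back(p);
    return kept;
}
```

In lkdemo itself you would capture the rectangle with a mouse-drag handler in onMouse (or cv::selectROI in later OpenCV versions), build the mask once on the first frame, and then run a keepInsideRoi-style filter after each calcOpticalFlowPyrLK call.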

Counting the basic operations of a given program

I am looking at the following: Operations Counting Example

Which is supposed to present the operations count of the following pseudocode:

Algorithm prefixAverages(A)
 Input array A of n numbers
 Output array B of n numbers such that B[i] is the average
 of elements A[0], A[1], … , A[i]

for i = 0 to n - 1 do
   b = 0
   for j = 0 to i do
       b = b + A[j]
       j++;
   B[i] = b / (i + 1)
return B

But I don't see how the counts on the inner for loop are reached. It says that for case i=0; j=0; the inner for loop runs twice? But it strikes me that it should only run once to see that 0 < 0. Can anyone provide insight into where the given operations count comes from or provide their own operations count?

This is under the assumption that primitive operations are:

  • Assignment
  • Array access
  • Mathematical operators (+, -, /, *)
  • Comparison
  • Increment/Decrement (math in disguise)
  • Return statements

Let me know if anything is unclear or you need more information
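One plausible reading of the example's numbers, assuming the counted operation is the loop-condition comparison: the condition of `for j = 0 to i` is evaluated once per pass plus once more for the final failing check, so for i = 0 it executes twice (j = 0 passes, j = 1 fails). An instrumented sketch of the inner loop under that assumption:

```cpp
#include <cassert>
#include <vector>

// Instrumented inner loop of prefixAverages: count how many times the
// condition "j <= i" is evaluated for each outer iteration i. This is a
// guess at the convention the referenced example uses, not a definitive
// reconstruction of its counts.
std::vector<int> innerComparisonCounts(int n) {
    std::vector<int> counts;
    for (int i = 0; i < n; ++i) {
        int comparisons = 0;
        int j = 0;
        for (;;) {
            ++comparisons;         // one evaluation of "j <= i"
            if (!(j <= i)) break;  // the final evaluation fails
            ++j;                   // loop body would run here
        }
        counts.push_back(comparisons); // i + 2 for row i: 2, 3, 4, ...
    }
    return counts;
}
```

Summed over i = 0..n-1 this gives n(n+3)/2 evaluations of the inner comparison alone.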

White Flicker after CombineRgn function

It seems the flickering is generated by the CombineRgn function, but I really have no idea why this happens; since I've never used regions much, I'm possibly missing some knowledge on the matter.

Some events in the program trigger the addition of little rectangles to the main region; here's the code that handles that:

        HRGN ActualRegion = CreateRectRgn(0, 0, 0, 0);
        GetWindowRgn(hwnd, ActualRegion);
        HRGN AddedRect = CreateRectRgn(/* long code that creates a rectangle */);
        CombineRgn(ActualRegion, ActualRegion, AddedRect, RGN_OR);
        DeleteObject(AddedRect); // CombineRgn copies into its destination; the source region must be freed

        SetWindowRgn(hwnd, ActualRegion, FALSE);
        InvalidateRect(hwnd, NULL, FALSE);

White flickering appears after the invalidation only if new regions were combined into the main one.

Here's how I'm implementing double buffering in WM_PAINT:

PLEASE NOTE that on creation I'm enabling the DWM blur-behind function with an invalid region (different from the main one), which means that everything painted with BLACK_BRUSH results in a 100% "invisible" portion of the program.

        RECT r; GetClientRect(hwnd, &r);

        PAINTSTRUCT ps; HDC hdc = BeginPaint(hwnd, &ps);
        HDC MemDc = CreateCompatibleDC(hdc);
        HBITMAP hBmp = CreateCompatibleBitmap(hdc, r.right, r.bottom);
        HBITMAP hOld = (HBITMAP)SelectObject(MemDc, hBmp);

        //Making sure this dc is filled with "invisible" pixels to display
        SelectObject(MemDc, GetStockObject(BLACK_BRUSH));
        Rectangle(MemDc, /* arbitrary values that match the entire screen */);

        BitBlt(hdc, 0, 0, GetSystemMetrics(SM_CXSCREEN), GetSystemMetrics(SM_CYSCREEN), MemDc, 0, 0,        SRCCOPY);

        //clean-up
        SelectObject(MemDc, hOld);
        DeleteObject(hBmp);
        DeleteDC(MemDc);
        EndPaint(hwnd, &ps);

WM_ERASEBKGND obviously returns TRUE without further handling, the WNDCLASSEX instance of the window has a default BLACK_BRUSH as the hbrBackground field.

I also tried to intercept and return TRUE from WM_NCPAINT message.

I'm doing everything necessary to avoid intermediate draw calls: everything handled inside WM_PAINT uses a backbuffer. I'd also like to mention that I'm not working with images/bitmaps; everything is drawn with GDI/GDI+, and nowhere am I actually issuing a "white" redraw that might cause said flicker. I'm a bit lost here.

Is there something I'm possibly missing? I can't really understand what may be causing white flickering in this scenario.

Cannot declare more than 1.5 million threads in C++

I have run into an issue where I can't declare more than 1.5 million threads. My code compiles fine but terminates immediately as soon as I change 1.5 million to 1.6 million. Here is the part of the code that is giving me grief.

#include <thread>
#include <mutex>

void run_parallel(arg1, arg2)
{
....
}

int main(int argc, char *argv[]) {

thread t[9000000];
int x =0;
for(int i=0; i< 3000; i++)
    for (int j=0; j<3000; j++)
    {
    t[x] = thread(run_parallel, arg1, arg2);
    t[x].join();
    x++;
    }
}

As you can see, I clearly need 9,000,000 threads to run this, but it is not executing. Any help would be much appreciated.
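The likely culprit is `thread t[9000000];`: nine million std::thread objects are declared on the stack, and the default stack is only a few megabytes, so the program dies before the loops even start. Since each thread is joined immediately after creation, only one thread object needs to exist at a time anyway. A sketch of that idea, with a hypothetical run_parallel that just bumps a counter and tiny loop bounds so it runs quickly:

```cpp
#include <cassert>
#include <thread>

// Hypothetical stand-in for the original run_parallel.
static void run_parallel(int* counter, int inc) { *counter += inc; }

// Launch outer*inner threads one at a time; because each is joined before
// the next is created, a single local std::thread object suffices and no
// giant array is needed, on the stack or anywhere else.
int run_all(int outer, int inner) {
    int total = 0;
    for (int i = 0; i < outer; ++i)
        for (int j = 0; j < inner; ++j) {
            std::thread t(run_parallel, &total, 1);
            t.join(); // thread finished; safe to reuse the slot
        }
    return total;
}
```

If actual concurrency is wanted, the usual pattern is a heap-allocated std::vector<std::thread> sized to something like std::thread::hardware_concurrency(); creating millions of OS threads is never it.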

Is my normal interpolation perspective-correct?

I am trying to implement a software renderer

It looks like this; it seems my interpolated normal is not perspective-correct.

I use scanline conversion and calculate the normals with the following steps:

Assume we are now drawing line AB (A B have same y value in screen space)

  1. Calculate the normal of B by interpolating the normals of the top and bottom vertices (the alpha and beta interpolation factors are derived from the top and bottom vertices in screen space).

  2. Calculating A is similar.

  3. Draw line AB, calculating the normals of the fragments by interpolating the normals of A and B.

  4. Calculate the light contribution.

Sorry for my bad english, hope the picture helps


If I am doing it wrong, how do I do the interpolation correctly?
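For reference, linear (screen-space) interpolation of vertex attributes is not perspective-correct; the standard fix is to linearly interpolate attr/w and 1/w across the scanline and divide per fragment. A sketch for one attribute between two vertices, where wa and wb are hypothetical clip-space w values for A and B:

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Perspective-correct interpolation of an attribute between two vertices:
// linearly interpolate attr/w and 1/w in screen space, then divide.
// t is the screen-space interpolation factor in [0, 1]; wa and wb are the
// clip-space w values of the two vertices.
Vec3 perspInterp(Vec3 na, float wa, Vec3 nb, float wb, float t) {
    const float invWa = 1.0f / wa, invWb = 1.0f / wb;
    const float invW = (1 - t) * invWa + t * invWb;  // interpolated 1/w
    const Vec3 overW = {                             // interpolated n/w
        (1 - t) * na.x * invWa + t * nb.x * invWb,
        (1 - t) * na.y * invWa + t * nb.y * invWb,
        (1 - t) * na.z * invWa + t * nb.z * invWb,
    };
    return { overW.x / invW, overW.y / invW, overW.z / invW };
}
```

With equal w values this collapses to plain linear interpolation; with unequal w the result is correctly biased toward the nearer vertex.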

Return type of decltype when applied to ternary(?:) expression

When I look into a code snippet for a possible implementation of std::common_type

template <class ...T> struct common_type;

template <class T>
struct common_type<T> {
    typedef decay_t<T> type;
};

template <class T, class U>
struct common_type<T, U> {
    typedef decay_t<decltype(true ? declval<T>() : declval<U>())> type;
};

template <class T, class U, class... V>
struct common_type<T, U, V...> {
    typedef common_type_t<common_type_t<T, U>, V...> type;
};

The part that confuses me is how the common type of two template arguments is obtained: it uses the ternary operator inside decltype.

As I know it, whether the second or third operand is returned is decided by the value of the first operand. In this snippet the first operand is true, which would mean the expression always evaluates to declval<T>(). If that were what decltype saw, it would make no sense... Therefore, I tried the following test:

int iii = 2;
float fff = 3.3;
std::cout << typeid(decltype(false? std::move(iii):std::move(fff))).name() << std::endl;
std::cout << typeid(decltype(std::move(iii))).name() << std::endl;
std::cout << typeid(decltype(false ? iii : fff)).name() << std::endl;
std::cout << typeid(decltype(true ? iii : fff)).name() << std::endl;

// [02:23:37][ryu@C++_test]$ g++ -std=c++14 -g common_type.cpp
// output 
// f
// i
// f
// f

Compared with the actual output, the result I expected was as follows:

int iii = 2;
float fff = 3.3;
std::cout << typeid(decltype(false ? iii : fff)).name() << std::endl; // should return i;
std::cout << typeid(decltype(true ? iii : fff)).name() << std::endl;  // should return f;

Can anyone explain why the actual result is different?

In other words, what type does decltype yield when applied to a ternary expression?
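The condition never participates in the type computation: the type of a conditional expression is determined at compile time from its second and third operands alone, by forming their common type through the usual conversions (here int and float converge on float, which is why both branches of the test print f). decltype then simply reports that type. A sketch that makes this checkable at compile time:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

int iii = 2;
float fff = 3.3f;

// The type of ?: is fixed at compile time from the operand types,
// regardless of the condition's value: int and float -> float prvalue.
static_assert(std::is_same<decltype(true  ? iii : fff), float>::value,
              "both branches have type float");
static_assert(std::is_same<decltype(false ? iii : fff), float>::value,
              "the condition's value is irrelevant to the type");
// decltype of a plain variable, by contrast, is just its declared type.
static_assert(std::is_same<decltype(iii), int>::value, "");
// With xvalue operands (int&& and float&&) the result is still a float prvalue.
static_assert(std::is_same<decltype(false ? std::move(iii) : std::move(fff)),
              float>::value, "");
```

If this compiles, the static_asserts have already verified the claim; no runtime behavior is involved.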

Ambiguous call to abs

I have a custom data type that in practice can be either float or double. On every OS except OSX, I am able to successfully build this C++11 template:

template< class REAL_T >
inline REAL_T inhouse_abs( REAL_T i_val )
{
    return (REAL_T)std::abs( (REAL_T)i_val );
}

However, clang 6.0 (3.5 LLVM) reports an ambiguous function call. If I change abs to fabs, the error is resolved on OSX, but now an identical error shows up on my Linux clang, gcc, and Visual Studio.

Any ideas?
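A likely explanation: the integer overloads of std::abs live in <cstdlib> and the floating-point ones in <cmath>, and which set happens to be visible differs between standard libraries, which is why the same template breaks differently per platform and per spelling (abs vs fabs). A sketch that should be portable, assuming REAL_T is a float or double type: include both headers explicitly and call abs unqualified after a using-declaration so overload resolution picks the right one:

```cpp
#include <cassert>
#include <cmath>    // float/double/long double overloads of std::abs
#include <cstdlib>  // integer overloads, for completeness

// The using-declaration makes all visible std::abs overloads candidates,
// so ordinary overload resolution chooses the right one for REAL_T instead
// of relying on whichever single overload a given platform happens to expose.
template <class REAL_T>
inline REAL_T inhouse_abs(REAL_T i_val)
{
    using std::abs;
    return abs(i_val);
}
```

For a template known to be floating-point only, std::fabs from <cmath> (which also has float/double/long double overloads) is an equally portable alternative.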

How can i edit a Qt designer form Class from a QGraphicsProxyWidget?

Who wants to make my life easier today?

I have written an app with Qt and now I am learning how to use QGraphicsProxyWidgets in a scene. So I have a QGraphicsView on my MainWindow, and this view has a scene in the background. Now I am trying to use a QGraphicsProxyWidget to display a mini window in the scene.

I made a new Qt Designer form class and designed a GUI with a QLabel and a QPushButton.

I use the code below to add this form to the scene:

proxy = new CustomProxy(0, Qt::Widget);
proxy->setWidget(new Form);
scene->addItem(proxy);

The result is a "mini" widget in the scene. The question is this: now I want to edit the QLabel from the Form class and display some live data there, for example the coordinates of the mouse on the scene, but I cannot find a way to access the objects of the Form class, like

ui->QLableName->setText("Hello World");

so I ask the community: how can I change the objects of the Form class via a QGraphicsProxyWidget?

File Handling in C reading multiple chars

abort action                        islemi durdur(MS)
abort sequence                      durdurma dizisi(IBM)

I have a file.txt like the above. I want to read the two columns of file.txt separately. Besides file.txt I have two more files, turkce.txt and ingilizce.txt.

Here is what I want to do :

I want to read from file.txt and separate the English and Turkish words. After that, ingilizce.txt becomes like this:

abort action
abort sequence

and turkce.txt like this

islemi durdur(MS)
durdurma dizisi(IBM)

Thank you for your answers.
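Assuming the two columns are separated by a run of two or more spaces (as the sample suggests), each line can be split at the first such run; the left half goes to ingilizce.txt and the trimmed right half to turkce.txt. A sketch of the splitting step (written in C++ for brevity; the same logic in C would use fgets and strstr):

```cpp
#include <cassert>
#include <string>

// Split one line of file.txt at the first run of two or more spaces:
// everything before it is the English term, everything after (with the
// padding spaces skipped) is the Turkish one. This assumes the columns
// are aligned with space padding, as in the sample.
bool splitLine(const std::string& line, std::string& english, std::string& turkish)
{
    const std::string::size_type gap = line.find("  ");
    if (gap == std::string::npos) return false;       // no column gap found
    english = line.substr(0, gap);
    const std::string::size_type rest = line.find_first_not_of(' ', gap);
    if (rest == std::string::npos) return false;      // nothing after the gap
    turkish = line.substr(rest);
    return true;
}
```

The main loop is then just std::getline over file.txt, writing english to ingilizce.txt and turkish to turkce.txt, one per line.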

Should I Program to an Interface or an Abstract Base Class? What exactly does that phrase mean?

In object oriented programming, I have read that you should program to an interface not an implementation but do they mean literal interfaces (no shared code at all)?

Is it okay to program to an abstract base class that would have been an interface except that it contains variables all sub-classes are expected to have? Replicating a variable across sub-classes would be an inconvenience, because if I changed the name of that variable in one sub-class I would have to change it in all of the sub-classes.

In following the principle of "program to an interface not an implementation", is this okay or would you create another interface on top of the abstract base class and program to that interface?
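One common resolution, sketched below under hypothetical names: keep a pure interface for clients to program against, and let an abstract base class implement the shared-state portion of it once, so the variable lives in exactly one place. Clients still depend only on the interface, which is what the principle is really after (depend on an abstraction, not a concrete implementation):

```cpp
#include <cassert>
#include <string>

// Pure interface: what client code programs against.
struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;
    virtual const std::string& name() const = 0;
};

// Abstract base: implements the shared-state part of the interface once,
// so sub-classes don't each replicate the name_ variable.
class NamedShape : public Shape {
public:
    const std::string& name() const override { return name_; }
protected:
    explicit NamedShape(std::string name) : name_(std::move(name)) {}
private:
    std::string name_;
};

class Square : public NamedShape {
public:
    explicit Square(double side) : NamedShape("square"), side_(side) {}
    double area() const override { return side_ * side_; }
private:
    double side_;
};

// Client code depends only on the interface, never on NamedShape or Square.
double doubledArea(const Shape& s) { return 2.0 * s.area(); }
```

Whether the extra pure-interface layer is worth it depends on whether any implementer would ever not want the shared state; if every implementer needs it, programming to the abstract base directly is a reasonable reading of the principle too.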

C: Error in Using: "Compound Assignment" and "Prefix Decrement" together

Can someone please tell me why a C compiler outputs an error when a compound assignment and a prefix decrement/increment are used together, but C++ does not?

int myVar = 5;
(--myVar) -= 4;
// C  : error C2106: '-=' : left operand must be l-value
// C++: myVar=0;

I know what the error says ...

But I can't understand why a C compiler doesn't treat (--myVar) as an lvalue while C++ does.
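The short version: in C, the result of prefix -- is a value, not an lvalue, so it cannot be assigned to; C++ changed prefix ++/-- to return an lvalue reference to the operand, which is what makes the compound assignment legal there. A minimal C++ check of the behavior the comment describes:

```cpp
#include <cassert>

// In C++, --myVar yields an lvalue referring to myVar, so the compound
// assignment applies to myVar itself: 5 -> 4 (prefix decrement), then
// 4 - 4 = 0. The same expression is a constraint violation in C, where
// --myVar is a non-lvalue value ("left operand must be l-value").
int decrement_then_subtract()
{
    int myVar = 5;
    (--myVar) -= 4;
    return myVar;
}
```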

Branchless K-means (or other optimizations)

I'm afraid this question might border dangerously close to the 'someone gimme solution plz' category, so if it's a poor question, please let me know and I'll delete it or try to do better. Actually I'd appreciate more of a guide to how to approach and come up with these kinds of solutions rather than the solution itself.

That aside, I have a very performance-critical function in my system showing up as a number one profiling hotspot in specific contexts. It's in the middle of a k-means iteration (already multithreaded using a parallel for processing sub-ranges of points in each worker thread).

ClusterPoint& pt = points[j];
pt.min_index = -1;
pt.min_dist = numeric_limits<float>::max();
for (int i=0; i < num_centroids; ++i)
{
    const ClusterCentroid& cent = centroids[i];
    const float dist = ...;
    if (dist < pt.min_dist) // <-- #1 hotspot
    {
        pt.min_dist = dist;
        pt.min_index = i;
    }
}

Any savings in the time required to process this section of code counts substantially, so I've often fiddled with it a lot. It might be worth putting the centroid loop outside, for example, and iterating through the points in parallel for a given centroid. The number of cluster points here spans into the millions, while the number of centroids spans into the thousands. The algorithm is applied for a handful of iterations (often under 10). It doesn't seek perfect convergence/stability, just some 'reasonable' approximation.

Any ideas are appreciated, but what I'm really eager to discover is if this code can be made branchless as it would allow for a SIMD version. I haven't really developed the kind of mental ability to easily grasp how to come up with branchless solutions: my brain fails there much like it did when I was first exposed to recursion in the early days, so a guide on how to write branchless code and how to develop the appropriate mindset for it would also be helpful.

In short, I'm looking for any guides and hints and suggestions (not necessarily solutions) on how to micro-optimize this code. It most likely has room for algorithmic improvements, but my blindspot has always been in micro-optimization solutions (and I'm curious to learn how to apply them more effectively without going overboard with it). It's already tightly multithreaded with chunky parallel for logic, so I'm pretty much pushed into the micro-optimization corner as one of the quicker things to try without a smarter algorithm outright. We're completely free to change the memory layout.

In Response to Benjamin Gruenbaum

About looking at this all wrong in seeking to optimize an O(knm) algorithm which could clearly be improved at the algorithmic level, I wholeheartedly agree. This pushes this specific question into a somewhat academic and impractical realm. However, if I could be allowed an anecdote, I come from an original background of high-level programming -- big emphasis on broad, large-scale viewpoint, safety, and very little on the low-level implementation details. I've recently switched projects to a very different kind of modern-flavored one and I'm learning all kinds of new tricks from my peers of cache efficiency, GPGPU, branchless techniques, SIMD, special-purpose mem allocators that actually outperform malloc (but for specific scenarios), etc.

It's where I'm trying to catch up with the latest performance trends, and surprisingly I've found that those old data structures I often favored during the 90s which were often linked/tree-type structures are actually being vastly outperformed by much more naive, brutish, micro-optimized, parallelized code applying tuned instructions over contiguous memory blocks. It's somewhat disappointing at the same time since I feel like we're fitting the algorithms more to the machine now and narrowing the possibilities this way (especially with GPGPU).

The funniest thing is that I find this type of micro-optimized, fast array-processing code much easier to maintain than the sophisticated algorithms and data structures I was using before. For a start, they're easier to generalize. So I've been jumping on that micro-optimization bandwagon a bit more lately, and perhaps a little too much in this specific case, but my curiosity is more about expanding my range of possible solutions for any scenario, and hence the bigger focus on micro-optimization with this particular code.

Disassembly

One of the 'duh' kind of things I forgot to do was post the disassembly! I'm terribly sorry about that. I am really, really bad at assembly, so I have often tuned things in a trial-and-error kind of way, coming up with somewhat educated guesses about why a hotspot shown in VTune might be the bottleneck, then trying things out to see if the times improve, assuming the guesses have some hint of truth if they do, or that I completely missed the mark if they don't.

Disassembly:

000007FEEE3FB8A1  jl          thread_partition+70h (7FEEE3FB780h) 
    {
        ClusterPoint& pt = points[j];
        pt.min_index = -1;
        pt.min_dist = numeric_limits<float>::max();
        for (int i = 0; i < num_centroids; ++i)
000007FEEE3FB8A7  cmp         ecx,r10d 
000007FEEE3FB8AA  jge         thread_partition+1F4h (7FEEE3FB904h) 
000007FEEE3FB8AC  lea         rax,[rbx+rbx*2] 
000007FEEE3FB8B0  add         rax,rax 
000007FEEE3FB8B3  lea         r8,[rbp+rax*8+8] 
        {
            const ClusterCentroid& cent = centroids[i];
            const float x = pt.pos[0] - cent.pos[0];
            const float y = pt.pos[1] - cent.pos[1];
000007FEEE3FB8B8  movss       xmm0,dword ptr [rdx] 
            const float z = pt.pos[2] - cent.pos[2];
000007FEEE3FB8BC  movss       xmm2,dword ptr [rdx+4] 
000007FEEE3FB8C1  movss       xmm1,dword ptr [rdx-4] 
000007FEEE3FB8C6  subss       xmm2,dword ptr [r8] 
000007FEEE3FB8CB  subss       xmm0,dword ptr [r8-4] 
000007FEEE3FB8D1  subss       xmm1,dword ptr [r8-8] 
            const float dist = x*x + y*y + z*z;
000007FEEE3FB8D7  mulss       xmm2,xmm2 
000007FEEE3FB8DB  mulss       xmm0,xmm0 
000007FEEE3FB8DF  mulss       xmm1,xmm1 
000007FEEE3FB8E3  addss       xmm2,xmm0 
000007FEEE3FB8E7  addss       xmm2,xmm1 

            if (dist < pt.min_dist)
// VTUNE HOTSPOT
000007FEEE3FB8EB  comiss      xmm2,dword ptr [rdx-8] 
000007FEEE3FB8EF  jae         thread_partition+1E9h (7FEEE3FB8F9h) 
            {
                pt.min_dist = dist;
000007FEEE3FB8F1  movss       dword ptr [rdx-8],xmm2 
                pt.min_index = i;
000007FEEE3FB8F6  mov         dword ptr [rdx-10h],ecx 
000007FEEE3FB8F9  inc         ecx  
000007FEEE3FB8FB  add         r8,30h 
000007FEEE3FB8FF  cmp         ecx,r10d 
000007FEEE3FB902  jl          thread_partition+1A8h (7FEEE3FB8B8h) 
    for (int j = *irange.first; j < *irange.last; ++j)
000007FEEE3FB904  inc         edi  
000007FEEE3FB906  add         rdx,20h 
000007FEEE3FB90A  cmp         edi,dword ptr [rsi+4] 
000007FEEE3FB90D  jl          thread_partition+31h (7FEEE3FB741h) 
000007FEEE3FB913  mov         rbx,qword ptr [irange] 
            }
        }
    }
}

We're forced into targeting SSE 2 -- a bit behind on our times, but the user base actually tripped up once when we assumed that even SSE 4 was okay as a min requirement (the user had some prototype Intel machine).

Update with Standalone Test: ~5.6 secs

I'm very appreciative of all the help being offered! Because the codebase is quite extensive and the conditions for triggering that code are complex (system events triggered across multiple threads), it's a bit unwieldy to make experimental changes and profile them each time. So I've set up a superficial test on the side as a standalone application that others can also run and try out so that I can experiment with all these graciously offered solutions.

#define _SECURE_SCL 0
#include <iostream>
#include <fstream>
#include <vector>
#include <limits>
#include <ctime>
#if defined(_MSC_VER)
    #define ALIGN16 __declspec(align(16))
#else
    #include <malloc.h>
    #define ALIGN16 __attribute__((aligned(16)))
#endif

using namespace std;

// Aligned memory allocation (for SIMD).
static void* malloc16(size_t amount)
{
    #ifdef _MSC_VER
        return _aligned_malloc(amount, 16);
    #else
        void* mem = 0;
        posix_memalign(&mem, 16, amount);
        return mem;
    #endif
}
template <class T>
static T* malloc16_t(size_t num_elements)
{
    return static_cast<T*>(malloc16(num_elements * sizeof(T)));
}

// Aligned free.
static void free16(void* mem)
{
    #ifdef _MSC_VER
        return _aligned_free(mem);
    #else
        free(mem);
    #endif
}

// Test parameters.
enum {num_centroids = 512};
enum {num_points = num_centroids * 2000};
enum {num_iterations = 5};
static const float range = 10.0f;

class Points
{
public:
    Points(): data(malloc16_t<Point>(num_points))
    {
        for (int p=0; p < num_points; ++p)
        {
            const float xyz[3] =
            {
                range * static_cast<float>(rand()) / RAND_MAX,
                range * static_cast<float>(rand()) / RAND_MAX,
                range * static_cast<float>(rand()) / RAND_MAX
            };
            init(p, xyz);
        }
    }
    ~Points()
    {
        free16(data);
    }
    void init(int n, const float* xyz)
    {
        data[n].centroid = -1;
        data[n].xyz[0] = xyz[0];
        data[n].xyz[1] = xyz[1];
        data[n].xyz[2] = xyz[2];
    }
    void associate(int n, int new_centroid)
    {
        data[n].centroid = new_centroid;
    }
    int centroid(int n) const
    {
        return data[n].centroid;
    }
    float* operator[](int n)
    {
        return data[n].xyz;
    }

private:
    Points(const Points&);
    Points& operator=(const Points&);
    struct Point
    {
        int centroid;
        float xyz[3];
    };
    Point* data;
};

class Centroids
{
public:
    Centroids(Points& points): data(malloc16_t<Centroid>(num_centroids))
    {
        // Naive initial selection algorithm, but outside the 
        // current area of interest.
        for (int c=0; c < num_centroids; ++c)
            init(c, points[c]);
    }
    ~Centroids()
    {
        free16(data);
    }
    void init(int n, const float* xyz)
    {
        data[n].count = 0;
        data[n].xyz[0] = xyz[0];
        data[n].xyz[1] = xyz[1];
        data[n].xyz[2] = xyz[2];
    }
    void reset(int n)
    {
        data[n].count = 0;
        data[n].xyz[0] = 0.0f;
        data[n].xyz[1] = 0.0f;
        data[n].xyz[2] = 0.0f;
    }
    void sum(int n, const float* pt_xyz)
    {
        data[n].xyz[0] += pt_xyz[0];
        data[n].xyz[1] += pt_xyz[1];
        data[n].xyz[2] += pt_xyz[2];
        ++data[n].count;
    }
    void average(int n)
    {
        if (data[n].count > 0)
        {
            const float inv_count = 1.0f / data[n].count;
            data[n].xyz[0] *= inv_count;
            data[n].xyz[1] *= inv_count;
            data[n].xyz[2] *= inv_count;
        }
    }
    float* operator[](int n)
    {
        return data[n].xyz;
    }
    int find_nearest(const float* pt_xyz) const
    {
        float min_dist_squared = numeric_limits<float>::max();
        int min_centroid = -1;
        for (int c=0; c < num_centroids; ++c)
        {
            const float* cen_xyz = data[c].xyz;
            const float x = pt_xyz[0] - cen_xyz[0];
            const float y = pt_xyz[1] - cen_xyz[1];
            const float z = pt_xyz[2] - cen_xyz[2];
            const float dist_squared = x*x + y*y + z*z;

            if (min_dist_squared > dist_squared)
            {
                min_dist_squared = dist_squared;
                min_centroid = c;
            }
        }
        return min_centroid;
    }

private:
    Centroids(const Centroids&);
    Centroids& operator=(const Centroids&);
    struct Centroid
    {
        int count;
        float xyz[3];
    };
    Centroid* data;
};

// A high-precision real timer would be nice, but we lack C++11 and
// the coarseness of the testing here should allow this to suffice.
static double sys_time()
{
    return static_cast<double>(clock()) / CLOCKS_PER_SEC;
}

static void k_means(Points& points, Centroids& centroids)
{
    // Find the closest centroid for each point.
    for (int p=0; p < num_points; ++p)
    {
        const float* pt_xyz = points[p];
        points.associate(p, centroids.find_nearest(pt_xyz));
    }

    // Reset the data of each centroid.
    for (int c=0; c < num_centroids; ++c)
        centroids.reset(c);

    // Compute new position sum of each centroid.
    for (int p=0; p < num_points; ++p)
        centroids.sum(points.centroid(p), points[p]);

    // Compute average position of each centroid.
    for (int c=0; c < num_centroids; ++c)
        centroids.average(c);
}

int main()
{
    Points points;
    Centroids centroids(points);

    cout << "Starting simulation..." << endl;
    double start_time = sys_time();
    for (int i=0; i < num_iterations; ++i)
        k_means(points, centroids);
    cout << "Time passed: " << (sys_time() - start_time) << " secs" << endl;
    cout << "# Points: " << num_points << endl;
    cout << "# Centroids: " << num_centroids << endl;

    // Write the centroids to a file to give us some crude verification
    // of consistency as we make changes.
    ofstream out("centroids.txt");
    for (int c=0; c < num_centroids; ++c)
        out << "Centroid " << c << ": " << centroids[c][0] << "," << centroids[c][1] << "," << centroids[c][2] << endl;
}

I am aware of the dangers of superficial testing but since it's already deemed to be a hotspot from previous real-world sessions, I hope it's excusable. I'm also just interested in the general techniques associated with micro-optimizing such code.

I did get slightly different results in profiling this one. The times are a bit more evenly dispersed within the loop here, and I'm not sure why. Perhaps it's because the data is smaller (I omitted members and hoisted out the min_dist member and made it a local variable). The exact ratio between centroids to points is also a bit different, but hopefully close enough to translate improvements here to the original code. It's also single-threaded in this superficial test, and the disassembly looks quite different so I may be risking optimizing this superficial test without the original (a risk I'm willing to take for now, as I'm more interested in expanding my knowledge of techniques that could optimize these cases rather than a solution for this exact case).


Update with Yochai Timmer's Suggestion -- ~12.5 secs

Oh, I face the woes of micro-optimization without understanding assembly very well. I replaced this:

        -if (min_dist_squared > dist_squared)
        -{
        -    min_dist_squared = dist_squared;
        -    pt.centroid = c;
        -}

With this:

        +const bool found_closer = min_dist_squared > dist_squared;
        +pt.centroid = bitselect(found_closer, c, pt.centroid);
        +min_dist_squared = bitselect(found_closer, dist_squared, min_dist_squared);

.. only to find the times escalated from ~5.6 secs to ~12.5 secs. Nevertheless, that is not his fault nor does it take away from the value of his solution -- that's mine for failing to understand what's really going on at the machine level and taking stabs in the dark. That one apparently missed, and apparently I was not the victim of branch misprediction as I initially thought. Nevertheless, his proposed solution is a wonderful and generalized function to try in such cases, and I'm grateful to add it to my toolbox of tips and tricks. Now for round 2.
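For reference, here is one common scalar formulation of the bitselect idea mentioned above (my guess at the shape of the suggested helper, not a quote of it): turn the condition into an all-ones or all-zeros mask and blend the two candidates bitwise, so no branch is emitted. The float variant blends the raw bit patterns via memcpy to stay well-defined:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Branchless select: mask is all ones when cond is true, all zeros
// otherwise, so the result is a bitwise blend of a and b.
inline int bitselect(bool cond, int a, int b)
{
    const int mask = -static_cast<int>(cond); // true -> 0xFFFFFFFF, false -> 0
    return (a & mask) | (b & ~mask);
}

// Float variant: blend the 32-bit patterns, then reinterpret the result.
inline float bitselect(bool cond, float a, float b)
{
    const uint32_t mask = 0u - static_cast<uint32_t>(cond);
    uint32_t ua, ub;
    std::memcpy(&ua, &a, sizeof ua);
    std::memcpy(&ub, &b, sizeof ub);
    const uint32_t ur = (ua & mask) | (ub & ~mask);
    float r;
    std::memcpy(&r, &ur, sizeof r);
    return r;
}
```

This is exactly the pattern the SSE intrinsics _mm_and_si128 / _mm_andnot_si128 / _mm_or_si128 express four lanes at a time in the SIMD solution further down.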

Harold's SIMD Solution - 2.496 secs (see caveat)

This solution might be amazing. After converting the cluster rep to SoA, I'm getting times of ~2.5 seconds with this one! Unfortunately, there appears to be a glitch of some sort. I'm getting very different results for the final output that suggests more than slight precision differences, including some centroids towards the end with values of 0 (implying that they were not found in the search). I've been trying to go through the SIMD logic with the debugger to see what might be up -- it could merely be a transcription error on my part, but here's the code in case someone could spot the error.

If the error could be corrected without slowing down the results, this speed improvement is more than I ever imagined from a pure micro-optimization!
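For readers following along, "converting the cluster rep to SoA" presumably means replacing the array of {count, xyz} structs with separate per-coordinate arrays, so that _mm_load_ps can fetch four consecutive centroids' x (or y, or z) values in one aligned load. A sketch of that layout (using std::vector here for brevity; the real code would allocate with the malloc16_t helper above to guarantee 16-byte alignment):

```cpp
#include <cassert>
#include <vector>

// Structure-of-arrays layout for the centroids: one contiguous array per
// coordinate, so four consecutive centroids' x values sit side by side in
// memory and can be fetched with a single SIMD load.
struct CentroidsSoA {
    std::vector<float> cen_x, cen_y, cen_z;

    explicit CentroidsSoA(int n) : cen_x(n), cen_y(n), cen_z(n) {}

    void set(int i, float x, float y, float z) {
        cen_x[i] = x; cen_y[i] = y; cen_z[i] = z;
    }

    // Scalar distance for reference; the SIMD path computes 4 of these at once.
    float dist_squared(int i, const float* pt) const {
        const float x = pt[0] - cen_x[i];
        const float y = pt[1] - cen_y[i];
        const float z = pt[2] - cen_z[i];
        return x*x + y*y + z*z;
    }
};
```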

    // New version of Centroids::find_nearest (from harold's solution):
    int find_nearest(const float* pt_xyz) const
    {
        __m128i min_index = _mm_set_epi32(3, 2, 1, 0);
        __m128 xdif = _mm_sub_ps(_mm_set1_ps(pt_xyz[0]), _mm_load_ps(cen_x));
        __m128 ydif = _mm_sub_ps(_mm_set1_ps(pt_xyz[1]), _mm_load_ps(cen_y));
        __m128 zdif = _mm_sub_ps(_mm_set1_ps(pt_xyz[2]), _mm_load_ps(cen_z));
        __m128 min_dist = _mm_add_ps(_mm_add_ps(_mm_mul_ps(xdif, xdif), 
                                                _mm_mul_ps(ydif, ydif)), 
                                                _mm_mul_ps(zdif, zdif));
        __m128i index = min_index;
        for (int i=4; i < num_centroids; i += 4) 
        {
            xdif = _mm_sub_ps(_mm_set1_ps(pt_xyz[0]), _mm_load_ps(cen_x + i));
            ydif = _mm_sub_ps(_mm_set1_ps(pt_xyz[1]), _mm_load_ps(cen_y + i));
            zdif = _mm_sub_ps(_mm_set1_ps(pt_xyz[2]), _mm_load_ps(cen_z + i));
            __m128 dist = _mm_add_ps(_mm_add_ps(_mm_mul_ps(xdif, xdif), 
                                                _mm_mul_ps(ydif, ydif)), 
                                                _mm_mul_ps(zdif, zdif));
            __m128i mask = _mm_castps_si128(_mm_cmplt_ps(dist, min_dist));
            min_dist = _mm_min_ps(min_dist, dist);
            min_index = _mm_or_si128(_mm_and_si128(index, mask), 
                                     _mm_andnot_si128(mask, min_index));
            index = _mm_add_epi32(index, _mm_set1_epi32(4));
        }

        ALIGN16 float mdist[4];
        ALIGN16 uint32_t mindex[4];
        _mm_store_ps(mdist, min_dist);
        _mm_store_si128((__m128i*)mindex, min_index);

        float closest = mdist[0];
        int closest_i = mindex[0];
        for (int i=1; i < 4; i++)
        {
            if (mdist[i] < closest) 
            {
                closest = mdist[i];
                closest_i = mindex[i];
            }
        }
        return closest_i;
    }

Harold's SIMD Solution (Corrected) - ~2.5 secs

After applying the corrections and testing them, the results are intact and correct, with a similar speed improvement over the original codebase!

Since this hits the holy grail of knowledge I was seeking to understand better (branchless SIMD), I'm going to award the solution with some extra props for more than doubling the speed of the operation. I have my homework cut out in trying to understand it, since my goal was not merely to mitigate this hotspot, but to expand on my personal understanding of possible solutions to deal with them.

Nevertheless, I'm grateful for all the contributions here from the algorithmic suggestions to the really cool bitselect trick! I wish I could accept all the answers. I may end up trying all of them at some point, but for now I have my homework cut out in understanding some of these non-arithmetical SIMD ops.

int find_nearest_simd(const float* pt_xyz) const
{
    __m128i min_index = _mm_set_epi32(3, 2, 1, 0);
    __m128 pt_xxxx = _mm_set1_ps(pt_xyz[0]);
    __m128 pt_yyyy = _mm_set1_ps(pt_xyz[1]);
    __m128 pt_zzzz = _mm_set1_ps(pt_xyz[2]);

    __m128 xdif = _mm_sub_ps(pt_xxxx, _mm_load_ps(cen_x));
    __m128 ydif = _mm_sub_ps(pt_yyyy, _mm_load_ps(cen_y));
    __m128 zdif = _mm_sub_ps(pt_zzzz, _mm_load_ps(cen_z));
    __m128 min_dist = _mm_add_ps(_mm_add_ps(_mm_mul_ps(xdif, xdif), 
                                            _mm_mul_ps(ydif, ydif)), 
                                            _mm_mul_ps(zdif, zdif));
    __m128i index = min_index;
    for (int i=4; i < num_centroids; i += 4) 
    {
        xdif = _mm_sub_ps(pt_xxxx, _mm_load_ps(cen_x + i));
        ydif = _mm_sub_ps(pt_yyyy, _mm_load_ps(cen_y + i));
        zdif = _mm_sub_ps(pt_zzzz, _mm_load_ps(cen_z + i));
        __m128 dist = _mm_add_ps(_mm_add_ps(_mm_mul_ps(xdif, xdif), 
                                            _mm_mul_ps(ydif, ydif)), 
                                            _mm_mul_ps(zdif, zdif));
        index = _mm_add_epi32(index, _mm_set1_epi32(4));
        __m128i mask = _mm_castps_si128(_mm_cmplt_ps(dist, min_dist));
        min_dist = _mm_min_ps(min_dist, dist);
        min_index = _mm_or_si128(_mm_and_si128(index, mask), 
                                 _mm_andnot_si128(mask, min_index));
    }

    ALIGN16 float mdist[4];
    ALIGN16 uint32_t mindex[4];
    _mm_store_ps(mdist, min_dist);
    _mm_store_si128((__m128i*)mindex, min_index);

    float closest = mdist[0];
    int closest_i = mindex[0];
    for (int i=1; i < 4; i++)
    {
        if (mdist[i] < closest) 
        {
            closest = mdist[i];
            closest_i = mindex[i];
        }
    }
    return closest_i;
}
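For anyone comparing outputs while debugging mismatches like the one above, a scalar reference version of the same nearest-centroid search is handy to diff against the SIMD results. This is my own sketch, parameterized over the SoA arrays (`cen_x`/`cen_y`/`cen_z`, `num_centroids`) used in the post:

```cpp
#include <cassert>
#include <cfloat>

// Scalar reference for find_nearest over the SoA centroid arrays.
// Slow but obviously correct; useful to validate the SIMD version.
int find_nearest_scalar(const float* pt_xyz,
                        const float* cen_x, const float* cen_y,
                        const float* cen_z, int num_centroids)
{
    float best = FLT_MAX;
    int best_i = 0;
    for (int i = 0; i < num_centroids; ++i)
    {
        const float dx = pt_xyz[0] - cen_x[i];
        const float dy = pt_xyz[1] - cen_y[i];
        const float dz = pt_xyz[2] - cen_z[i];
        const float d = dx*dx + dy*dy + dz*dz;
        if (d < best) { best = d; best_i = i; }
    }
    return best_i;
}
```

Running both versions over the same inputs and asserting equal indices quickly localizes transcription errors like the misplaced `index` increment.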

Accessing Qt widget thread safe

I use a QPlainTextEdit to display some text. This text is modified (appended) from a thread other than the UI (main) thread by sending a signal to the widget:

connect(this, SIGNAL(addText(QString)), mUi->plainTextEditLog, SLOT(appendPlainText(QString)));

...
emit addText(QString::fromStdString(someString));
...

Another thread reads the text of this QPlainTextEdit and writes it to a file:

QFile file(fileName);
if (!file.open(QIODevice::WriteOnly | QIODevice::Text | QIODevice::Truncate)) {
    return;
}
file.write(mUi->plainTextEditLog->toPlainText().toUtf8());
file.close();

As far as I know, Qt widgets aren't thread-safe. I thought about using a mutex to lock around the signal emit, but that wouldn't really lock anything, because it only sends a signal asynchronously.

The reason I use signals is that the writing method can be called from more than one thread; a mutex didn't help in this case, but signals work perfectly.

My second thought was to also store the text in my class and guard that string with a mutex. But I am not sure how efficient this is, because then not only the QPlainTextEdit has to be modified but also the string copy.
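The second approach can be sketched without any Qt types: the class keeps its own mutex-guarded copy of the log text, the UI still receives updates via the queued signal, and the file-writing thread reads the copy instead of touching the widget. Class and method names here are mine, purely illustrative:

```cpp
#include <cassert>
#include <mutex>
#include <string>

// Thread-safe text buffer kept alongside the widget. Any thread may
// append; the file-writer thread takes a snapshot instead of calling
// toPlainText() on the (non-thread-safe) QPlainTextEdit.
class LogBuffer {
public:
    void append(const std::string& line) {
        std::lock_guard<std::mutex> lock(mutex_);
        text_ += line;
        text_ += '\n';
    }
    // Returns a copy, so the caller performs the (slow) file I/O
    // without holding the lock.
    std::string snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return text_;
    }
private:
    mutable std::mutex mutex_;
    std::string text_;
};
```

The extra string copy is usually cheap relative to file I/O, and it removes the cross-thread widget read entirely.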

std::async using an rvalue reference bound to a lambda

I'm trying to bind an rvalue reference to a lambda using std::bind, but I have issues when I throw that into a std::async call: (source)

auto lambda = [] (std::string&& message) {
    std::cout << message << std::endl;
};
auto bound = std::bind(lambda, std::string{"hello world"});
auto future = std::async(bound); // Compiler error here
future.get();

This issues a compiler error I'm not really sure how to interpret:

error: no type named 'type' in 'class std::result_of<std::_Bind<<lambda>(std::basic_string<char>)>&()>'

What's going on here? Interestingly, a slight modification does compile and work as expected. If I change std::string{"hello world"} to a C-string literal, everything works fine: (source)

auto lambda = [] (std::string&& message) {
    std::cout << message << std::endl;
};
auto bound = std::bind(lambda, "hello world");
auto future = std::async(bound);
future.get(); // Prints "hello world" as expected

Why does this work but not the first example?
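For what it's worth, the root cause is that std::bind stores the bound std::string and passes it to the lambda as an lvalue, which cannot bind to the std::string&& parameter; with the C-string literal, bind stores a const char*, so a fresh std::string temporary (an rvalue) is constructed at each call. One workaround sketch skips std::bind entirely and moves the argument via a C++14 init-capture (the helper name and "got: " prefix are mine, for illustration):

```cpp
#include <cassert>
#include <future>
#include <string>
#include <utility>

// Move the string into the closure; std::move inside the body then
// produces the rvalue that the std::string&& parameter requires.
std::string run_async_hello() {
    auto lambda = [](std::string&& message) {
        return "got: " + std::move(message);
    };
    auto future = std::async(std::launch::deferred,
        [&lambda, msg = std::string{"hello world"}]() mutable {
            return lambda(std::move(msg));
        });
    return future.get(); // runs the deferred task on this thread
}
```

std::launch::deferred is used here only so the sketch does not depend on thread creation; a plain std::async call behaves the same with respect to the rvalue binding.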

CreateFile returning ERROR_ACCESS_DENIED for volumes

I am trying to open a volume using CreateFile. The code I am using is this:

CreateFile( TEXT("\\\\.\\c:"), 
    GENERIC_READ | GENERIC_WRITE, 
    FILE_SHARE_READ | FILE_SHARE_WRITE,
    NULL,
    OPEN_EXISTING,
    0,
    NULL);

I took this sample from this link: Walk through NTFS journal windows. It was failing for my own code earlier, so I tried the above snippet from that link. It too failed with the same error. I have read about it a lot but could not find a solution. Can someone please explain how to obtain a volume handle correctly?

Custom QSslSocket for QNetworkAccessManager

In my project I need to use a specific version of OpenSSL. I'm using both Qt 4.8.6 and Qt 5.4.0. I'd like to create a custom QSslSocket to be passed to QNetworkAccessManager, which will be used for a QWebView.

I noticed that in Qt 4.8.6 only TLS 1.0 is supported, newer protocol versions aren't.

Is there a way to pass a subclassed QSslSocket (with a TLS 1.2 version) to QNetworkAccessManager in an easy way? Looking at the source code, it appears to be hidden from public usage (QSslSocket is a friend of the private implementation).

Note: I don't want to use QHttp because it's not public anymore in newer Qt libraries, making it hard to be portable.

Edit: There's a similar question (QNetworkAccessManager/QNetworkReply with custom QTcpSocket?), asked 5 years ago, but it is still not possible to modify the QSslSocket directly. The answer given back then is too generic.

How to redirect all OpenGL output?

I wrote a minimal OpenGL application and linked the console to the project. OpenGL outputs its version and things like that to the console.

The small OpenGL framework that I am writing will be used by an application that features its own logging, so I want to redirect all logging there.

So far I have tried googling the problem, but I cannot find all the information I need. There are debugging tools, but those are stand-alone. I found logging options, but it is not clear to me whether they are meant to catch all messages or just some.

void glDebugMessageCallback(DEBUGPROC callback, void* userParam);

Will registering a callback with this function catch all messages, or will I miss some of them? If so, how do I log everything?

Object listed in headerfile, yet undefined reference in cpp file

In the private section of the class in my header file, I have

 Heap_PriorityQueue<Passenger> oldBoardingQueue;

I have #include "Heap_PriorityQueue.h" in both the header and the cpp just to be safe. I haven't even started using it in my cpp file, yet when I try to compile, the cpp file throws up a bunch of

undefined reference to 'Heap_PriorityQueue<Passenger>::(insert function for Heap_PriorityQueue class: isEmpty, add, etc.)

Followed by several

undefined reference to 'non-virtual thunk to Heap_PriorityQueue<Passenger>::(Heap_PriorityQueue functions again)

Unsure of how to proceed. Am I declaring incorrectly?

C++ Using an int to set an enum from a list

I am using a header file for character class in a game (doing it as a side project for experience). I have it working but it just feels like I'm doing this the long way. I ask for an 'int' to set the character class then set it based on the enum position using a switch statement. Is there a clearer, shorter way to do this operation? Am I doing anything here that would be considered bad practice?

class Character_Class {
  public:
    enum classofnpc { CLERIC, FIGHTER, ROGUE, WIZARD, BARBARIAN, DRUID, PALADIN, SORCERER, BARD, MONK, RANGER, WARLOCK };

    Character_Class(const int& a, const int& b){
        switch (a) {
            case 0 :
                a_class = CLERIC;
                break;
            case 1 :
                a_class = FIGHTER;
                break;
            case 2 :
                a_class = ROGUE;
                break;
            case 3 :
                a_class = WIZARD;
                break;
            case 4 :
                a_class = BARBARIAN;
                break;
            case 5 :
                a_class = DRUID;
                break;
            case 6 :
                a_class = PALADIN;
                break;
            case 7 :
                a_class = SORCERER;
                break;
            case 8 :
                a_class = BARD;
                break;
            case 9 :
                a_class = MONK;
                break;
            case 10 :
                a_class = RANGER;
                break;
            case 11 :
                a_class = WARLOCK;
                break;
        }
        lvl = b;
    }
  private:
    classofnpc a_class;
    int lvl;
};
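Since the enumerators are contiguous and start at 0, the whole switch can collapse to a range check plus a static_cast. A sketch of that shorter form (the CLASS_COUNT sentinel and the accessors are my additions, not in the original):

```cpp
#include <cassert>
#include <stdexcept>

class Character_Class {
  public:
    enum classofnpc { CLERIC, FIGHTER, ROGUE, WIZARD, BARBARIAN, DRUID,
                      PALADIN, SORCERER, BARD, MONK, RANGER, WARLOCK,
                      CLASS_COUNT }; // sentinel: one past the last real class

    Character_Class(int a, int b) : lvl(b) {
        if (a < 0 || a >= CLASS_COUNT)
            throw std::out_of_range("invalid class index");
        // Valid because the enumerators are exactly 0..11 with no gaps.
        a_class = static_cast<classofnpc>(a);
    }
    classofnpc character_class() const { return a_class; }
    int level() const { return lvl; }
  private:
    classofnpc a_class;
    int lvl;
};
```

Adding a new class then only means extending the enum before CLASS_COUNT, as long as the values stay contiguous.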

Taking "exit" as user input in C++

I'm currently trying to write code that takes user input as strings and then converts them to integers if I need to. If the user enters exit, the program should move on to calling a function. Here is what I have so far:

void printArray(string string_array[], int size){
    for (int i = 0; i < size; i++){
        cout << string_array[i];
    }

}
void a_func(){
    string string_array[10];
    string user_input;

    while (user_input != "exit"){
        cout << "Please enter a number between 0 - 100: ";
        cin >> user_input;
        if (stoi(user_input) < 0 || stoi(user_input) > 100){
            cout << "Error, please re-enter the number between 0 - 100: ";
            cin >> user_input;
        }
        else if (user_input == "exit"){
            printArray(string_array, 10);
        }
        int array_index = stoi(user_input) / 10;
        string_array[array_index] = "*";
    }
}
However, as I'm testing the program, the console aborts the program when I enter exit. Is there a way for me to enter exit and have the program call printArray?
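The abort comes from stoi("exit") throwing std::invalid_argument: the first if calls stoi before the exit comparison is ever reached. A restructured sketch that tests for "exit" before any conversion (the helper function and the istream parameter are my additions, for testability):

```cpp
#include <cassert>
#include <iostream>
#include <string>
using namespace std;

// Processes one token: returns false when the token is "exit",
// true otherwise. stoi is only reached AFTER the "exit" check,
// so it never sees a non-numeric string.
bool process_token(const string& token, string (&slots)[11]) {
    if (token == "exit")
        return false;
    int value = stoi(token);            // safe: token is numeric here
    if (value >= 0 && value <= 100)     // note: 100/10 == 10, hence 11 slots
        slots[value / 10] = "*";
    return true;
}

// Loop sketch mirroring the question's a_func():
void a_func(istream& in) {
    string slots[11];
    string token;
    while (in >> token && process_token(token, slots)) { }
    for (const string& s : slots) cout << s;  // printArray equivalent
}
```

Note also that 100 / 10 == 10, so storing the marker for 100 needs an 11th slot; the original 10-element array would overflow.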

Can't use imshow in xcode 5 on OSX 10.8

I'm kind of puzzled, as I cannot use imshow from the OpenCV library. I use plenty of other functions from OpenCV, but I get this error when I want to show my matrix/image.

Undefined symbols for architecture x86_64:
  "cv::namedWindow(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int)", referenced from:
      _main in main.o
  "cv::imshow(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, cv::_InputArray const&)", referenced from:
      _main in main.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)

The code lines/functions throwing this error are:

cvStartWindowThread();
namedWindow("DisplayImage", WINDOW_AUTOSIZE);
imshow("Display Image", img_bgr);

where img_bgr is my desired matrix.

I'm working on OSX 10.8.5 with Xcode 5.1.1 and opencv-2.4.10 was installed via this walkthrough.

Does anyone have an idea why I can't call the above functions? I searched for threads concerning this problem, but none seemed satisfying.

EDIT:

I have uninstalled and reinstalled OpenCV twice now, once with CMake itself (as an application) and once with the help of this tutorial. Nothing worked; I still get the same error. Does no one have an idea?

mysterious rtm abort using haswell tsx

I'm experimenting with the TSX extensions in Haswell by adapting an existing medium-sized (thousands of lines) codebase to use GCC transactional memory extensions (which indirectly use Haswell TSX on this machine) instead of coarse-grained locks.

I'm having issues getting it to work fast enough because I get high rates of hardware transaction abort for mysterious reasons. As shown below, these aborts are not due to conflicts nor due to capacity limitations.

Here is the perf command I used to quantify the failure rate and underlying causes:

perf stat \
 -e cpu/event=0x54,umask=0x2,name=tx_mem_abort_capacity_write/ \
 -e cpu/event=0x54,umask=0x1,name=tx_mem_abort_conflict/ \
 -e cpu/event=0x5d,umask=0x1,name=tx_exec_misc1/ \
 -e cpu/event=0x5d,umask=0x2,name=tx_exec_misc2/ \
 -e cpu/event=0x5d,umask=0x4,name=tx_exec_misc3/ \
 -e cpu/event=0x5d,umask=0x8,name=tx_exec_misc4/ \
 -e cpu/event=0x5d,umask=0x10,name=tx_exec_misc5/ \
 -e cpu/event=0xc9,umask=0x1,name=rtm_retired_start/ \
 -e cpu/event=0xc9,umask=0x2,name=rtm_retired_commit/ \
 -e cpu/event=0xc9,umask=0x4,name=rtm_retired_aborted/pp \
 -e cpu/event=0xc9,umask=0x8,name=rtm_retired_aborted_misc1/ \
 -e cpu/event=0xc9,umask=0x10,name=rtm_retired_aborted_misc2/ \
 -e cpu/event=0xc9,umask=0x20,name=rtm_retired_aborted_misc3/ \
 -e cpu/event=0xc9,umask=0x40,name=rtm_retired_aborted_misc4/ \
 -e cpu/event=0xc9,umask=0x80,name=rtm_retired_aborted_misc5/ \ 
./myprogram -th 1 -reps 3000000

So, the program runs some code with transactions in it 30 million times. Each request involves one GCC __transaction_atomic block. There is only one thread in this run.

This particular perf command captures most of the relevant TSX performance events described in the Intel Software Developer's Manual, Vol. 3.

The output from perf stat is the following:

             0 tx_mem_abort_capacity_write                                  [26.66%]
             0 tx_mem_abort_conflict                                        [26.65%]
    29,937,894 tx_exec_misc1                                                [26.71%]
             0 tx_exec_misc2                                                [26.74%]
             0 tx_exec_misc3                                                [26.80%]
             0 tx_exec_misc4                                                [26.92%]
             0 tx_exec_misc5                                                [26.83%]
    29,906,632 rtm_retired_start                                            [26.79%]
             0 rtm_retired_commit                                           [26.70%]
    29,985,423 rtm_retired_aborted                                          [26.66%]
             0 rtm_retired_aborted_misc1                                    [26.75%]
             0 rtm_retired_aborted_misc2                                    [26.73%]
    29,927,923 rtm_retired_aborted_misc3                                    [26.71%]
             0 rtm_retired_aborted_misc4                                    [26.69%]
           176 rtm_retired_aborted_misc5                                    [26.67%]

  10.583607595 seconds time elapsed

As you can see from the output:

  • The rtm_retired_start count is 30 million (matches the input to the program)
  • The rtm_retired_aborted count is about the same (no commits at all)
  • The abort_conflict and abort_capacity counts are 0, so these are not the reasons. Also, recall that only one thread is running, so conflicts should be rare.
  • The only actual leads here are the high values of tx_exec_misc1 and rtm_retired_aborted_misc3, which are somewhat similar in description.

The Intel manual (Vol. 3) defines the rtm_retired_aborted_misc3 counter:

code: C9H 20H

mnemonic: RTM_RETIRED.ABORTED_MISC3

description: Number of times an RTM execution aborted due to HLE unfriendly instructions.

The definition for tx_exec_misc1 has some similar words:

code: 5DH 01H

mnemonic: TX_EXEC.MISC1

description: Counts the number of times a class of instructions that may cause a transactional abort was executed. Since this is the count of execution, it may not always cause a transactional abort.

I checked the assembly location of the aborts using perf record / perf report with high-precision (PEBS) support for rtm_retired_aborted. The location has a mov instruction from register to register. No weird instruction names are seen nearby.

Could someone clarify or help me understand the causes of misc3 aborts (and misc1 tx_exec) in general? Perhaps someone else has run into this problem as well?

some clarifying notes:

  • I am using GCC's transactional_memory extensions, not writing my own _xbegin / _xend directly, and I am running with ITM_DEFAULT_METHOD=htm.

  • Sadly, this behavior is somewhat erratic. At some point, in one version of the code, it would happen when compiled with -O1 but not with -O2 nor with -O0. Later on, it also started happening with -O2 when I made various changes to the code for unrelated reasons. In all these cases, the 'culprit' instructions blamed by perf with PEBS enabled were not obviously wrong (mov register to register, push to the stack), and there were no obvious bad instructions nearby either.

  • I'm going to try reproducing this problem in a smaller codebase, but even now, sometimes the hardware success rate goes up to 99.9% due to seemingly minor changes. This makes it hard to keep shaving off code while keeping the problem reproducible, and also hard to understand the difference between the changes.

Using colon (':') to access elements in an array in C++ (in Rcpp)

I am trying to run the following code. Frankly, I know only a little C++, but I want to get the following function to run. Can you help me run this silly example?

cppFunction(
  'NumericVector abc(int x, int x_end, NumericVector y)
  {
    NumericVector z;
    int x1 = x + x_end;
    z = y[x:x1];
    return(z);  
   }'
)

abc(3,c(0,1,10,100,1000,10000))

I see this ...

error: expected ']' before ':' token

Update: Sorry, I forgot to mention that I need to generate a sequence of numbers from x to x1. IntegerVector::create only creates a vector containing x and x1, not x through x1. The example I gave was trivial; I have updated it now. I need to do in C++ what seq() does in R.
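To see the idea outside Rcpp: the colon slice y[x:x1] corresponds in plain C++ to copying an inclusive iterator range. Rcpp vectors expose begin()/end() the same way (and Rcpp also offers Range(x, x1) subsetting, if I remember its sugar API correctly). A standalone sketch with std::vector, 0-based indices:

```cpp
#include <cassert>
#include <vector>

// C++ analogue of slicing y[x:x1] inclusively: copy the iterator
// range [begin+x, begin+x1+1). No colon operator exists in C++.
std::vector<double> slice(const std::vector<double>& y, int x, int x1) {
    return std::vector<double>(y.begin() + x, y.begin() + x1 + 1);
}
```

The same two-iterator constructor pattern works on Rcpp::NumericVector, which is why the SoA-style begin()+offset arithmetic carries over directly.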

Using a resource file in QMediaPlayer

How do I load a .mp3 file to use in a QMediaPlayer from a .qrc resource file?

This is what I have so far

QMediaPlayer *player = new QMediaPlayer;
player->setMedia(QUrl::fromLocalFile(":/audio/theme.mp3"));
player->play();

resources.qrc:

<RCC>
    <qresource prefix="/audio">
        <file>theme.mp3</file>
    </qresource>
</RCC>

theme.mp3 is located in the project directory.

Coloring the entire background of an MFC static label

This Answer is really great if you want to change the background color of a "conventional" text label. But what if you want to put a border around that text label and expand its size so that the text is swimming in a veritable sea of color? It only paints the text background in the required color, and leaves the rest of the expanded control with the standard button face. How can one make the color consistent across the entire control?

Note: The attractive feature (to me anyway) about the above answer is that it makes use of OnCtlColor(), which provides a pointer to the CWnd control concerned. So there is no need to create a subclass of CStatic to handle the color change. An answer that avoids creating such a subclass would be preferred.

Which dummy pointer values are okay

This is something that's been on my mind for a long time. Every so often I see people use 1 or -1 as a dummy value for a pointer, to save the need for a different variable.

Is it okay to use pointers like this? I suppose -1 is okay, but is it also okay to use other negative numbers? And can I be sure 1 isn't a used memory location? If so, up to which value can I use memory pointers?

P.S. I'm aware that storing other data, especially positive values, in pointers is bad practice. I'm just wondering exactly how bad it is.
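For the record, nothing in the language guarantees that values like (T*)1 or (T*)-1 differ from real object addresses. A dummy value that is guaranteed unique and portable is the address of a dedicated static object; a sketch (names mine):

```cpp
#include <cassert>

// A dedicated object whose address serves as the dummy value.
// No other object can share this address, unlike magic integers
// such as (T*)1 or (T*)-1, whose validity is implementation-defined.
static int dummy_storage;
static int* const DUMMY = &dummy_storage;

// Example use: mark a slot as "special" without a separate flag.
bool is_dummy(const int* p) { return p == DUMMY; }
```

The sentinel must never be dereferenced, but comparing against it is always well-defined, which is exactly the property the magic-integer trick lacks.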

incompatible pointer types initializing initializing a basic trie node

I know C is very finicky about file-level initialization. Or rather, I just don't know what constant expression means yet.

What I want to do is initialize a node (aka struct node) with all null pointers.

//Trie node definition
typedef struct node{
    bool is_word;
    struct node* next[27]; //27 for the valid number of chars

}node;


struct node* empties[27];
node empty = {.is_word = 0, .next = empties};


dictionary.c:24:33: error: incompatible pointer types initializing 'struct node *' with an
      expression of type 'struct node *[27]' [-Werror,-Wincompatible-pointer-types]
node empty = {.is_word=0,.next =empties};
                                ^~~~~~~
dictionary.c:24:33: error: suggest braces around initialization of subobject
      [-Werror,-Wmissing-braces]
node empty = {.is_word=0,.next =empties};

I'm getting an error when I try to initialize this. I would also try manually initializing the members, but 27 elements makes that very tedious. Is there a way to loop-initialize at the file level?
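For what it's worth, no loop is needed: in C, an aggregate initializer with fewer initializers than members zero-fills the rest, and a zero-initialized pointer is a null pointer. The error arises from assigning an array (empties) to the array member .next, which instead takes its own braced initializer. A sketch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct node {
    bool is_word;
    struct node* next[27]; /* 27 valid characters */
} node;

/* .next is itself an array member, so it takes a braced initializer
 * (or is zero-filled when omitted); assigning another array to it is
 * what triggered the incompatible-pointer-types error. */
node empty  = { .is_word = false };                  /* .next -> all NULL */
node empty2 = { .is_word = false, .next = {NULL} };  /* explicit; rest NULL */
```

Both forms are constant expressions, so they are valid at file scope.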

Lua - set a function from a class

I am currently trying to learn how to bridge Lua with C++ and write scripts. The problem I have is that most tutorials and documentation are somewhat simplified. Or maybe I have the wrong approach, which is why I am asking here now.

I am trying to initialize the metatable and, more specifically, trying to add functions, like so:

luaL_Reg rmFuncs[] =
    {
        { "rotate", RotateBlocks },
        { NULL, NULL }
    };

But adding RotateBlocks is a bit tricky. All the examples I've found do this in main and not in classes. RotateBlocks is actually Application::RotateBlocks.

So the only way for me to make it work is like this:

static int RotateBlocks(lua_State* L);

Is this the only way? Because this causes a lot of problems. As it is static, I can't actually rotate the objects that I want inside the function, because all my member variables complain about the function being static.

So how do I actually do anything useful inside my functions, and not just printf a random sentence as in all the tutorials and documentation?

How to display highlight of CDockablePane when dragging in MFC

I created an MDI project with a MainFrame and a ChildFrame. Then I added a derived class called CThumbnailPane, inherited from CDockablePane, inside CChildFrame (this is mandatory):

BOOL CChildFrame::CreateThumbnailPane(void)
{
    SChildFrame* member = (SChildFrame*)m;
    ASSERT(member);

    CRect r(0, 0, 150, 150);
    if (!member->m_NoteThumbnailPane.CreateEx(NULL, _T("ThumbnailPane"), this, r, TRUE, VIEWID_NOTE_THUMBNAIL_PANE, 
        WS_CHILD | WS_CLIPCHILDREN | CBRS_LEFT | CBRS_RIGHT))
    {
        return FALSE;
    }

    member->m_NoteThumbnailPane.SetMinSize(CSize(100, 100));
    member->m_NoteThumbnailPane.EnableDocking(CBRS_ALIGN_ANY);

    //show at first
    ShowToolPane(TRUE,TRUE);

    return TRUE;
}

void CChildFrame::ShowToolPane(BOOL bShow, BOOL bFirst)
{
    SChildFrame* member = (SChildFrame*)m;
    ASSERT(member);

    if (member->m_NoteThumbnailPane.GetSafeHwnd() == NULL)
        return;

    if (bShow)
    {
        if (member->isFirstDisplayThumbnail == TRUE)
        {
            member->m_NoteThumbnailPane.ShowPane(TRUE, FALSE, TRUE);
            member->m_NoteThumbnailPane.DockToFrameWindow(CBRS_ALIGN_BOTTOM);
            member->isFirstDisplayThumbnail = FALSE;
        }
        else
        {
            member->m_NoteThumbnailPane.DockToRecentPos();
            member->m_NoteThumbnailPane.ShowPane(TRUE, FALSE, TRUE);
        }
    }
    else
    {
        member->m_NoteThumbnailPane.ShowPane(FALSE, FALSE, FALSE);      
    }   
}

In order to adjust the layout of the pane, I use CSmartDockingManager:

{
    CSmartDockingManager* sdManager = m_pDockManager->GetSmartDockingManager();
    sdManager->OnPosChange();
    AFXGetParentFrame(this)->RecalcLayout();
}

=> Everything goes well, BUT the 4 individual direction buttons are not OK, because the pane location is not highlighted when I drag the pane onto them, unlike with the center group buttons.

Could you guide me on how to display the pane highlight when dragging, OR how to hide the 4 individual direction buttons?

intialization of structs/classes without constructors in stack vs heap

I would like to know the rule for zeroing-out structs (or classes) that have no default constructor in C++.

In particular, it seems that if stored on the stack (say, as a local variable) they are uninitialized, but if allocated on the heap, they are zero-initialized (tested with GCC 4.9.1). Is this guaranteed to be portable?

Example program:

#include <iostream>
#include <map>
using namespace std;

struct X {
    int i, j, k;
    void show() { cout << i << " " << j << " " << k << endl; }
};

int fib(int i) {
    return (i > 1) ? fib(i-1) + fib(i-2) : 1;
}

int main() {
    map<int, X> m;            
    fib(10);                  // fills the stack with cruft
    X x1;                     // local
    X &x2 = m[1];             // heap-allocated within map
    X *x3 = new X();          // explicitly heap-allocated
    x1.show();  // --> outputs whatever was on the stack in those positions
    x2.show();  // --> outputs 0 0 0 
    x3->show(); // --> outputs 0 0 0     
    return 0;
}


Edited: removed an "or should I just use a constructor" in the bolded part, because what made me ask is that I want to know whether it is guaranteed behaviour or not - we can all agree that readable code is better off with explicit constructors.
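To make the distinction concrete: the difference is not stack vs heap but default-initialization vs value-initialization. `X x1;` default-initializes (members indeterminate), while `new X()` and map's operator[] value-initialize. The forms that do guarantee zeros, regardless of storage location, can be sketched as:

```cpp
#include <cassert>

struct X {
    int i, j, k;
};

// Value-initialization zeroes the members of a constructor-less
// aggregate; default-initialization (plain `X a;`) leaves them
// indeterminate, whether on the stack or the heap.
X make_zeroed() {
    X a{};        // value-initialization -> i == j == k == 0, guaranteed
    return a;
}
X make_zeroed_paren() {
    return X();   // also value-initialization
}
```

So the portable answer is: write `X x1{};` (or `X x1 = X();`) when zeros are required; the heap results in the question came from the `()` in `new X()` and from map's value-initializing insertion, not from the heap itself.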

Output the DLL instead of .exe with cmake/mingw

I'm not very familiar with either cmake or mingw.

I have some source code that can be built with them (and the build process works fine with mingw32-make). The problem is, I would like to output DLLs and not .exe files.

I have a CMakeLists.txt file that I believe stores the configuration I have to change (the part below is responsible for producing the .exe files for the .cpp files in the Examples/ directory, taking some dependencies into account):

# C examples
if(PSMOVE_BUILD_EXAMPLES)
    foreach(EXAMPLE example multiple dump_calibration battery_check)
        add_executable(${EXAMPLE} examples/c/${EXAMPLE}.c)
        target_link_libraries(${EXAMPLE} psmoveapi)
    endforeach()

    if(PSMOVE_BUILD_TRACKER AND PSMOVE_BUILD_TUIO_SERVER)
        include_directories(${PSMOVEAPI_SOURCE_DIR}/external/TUIO_CPP/TUIO)
        include_directories(${PSMOVEAPI_SOURCE_DIR}/external/TUIO_CPP/oscpack)
        add_executable(tuio_server examples/c/tuio_server.cpp
            external/TUIO_CPP/TUIO/TuioClient.cpp
...
            external/TUIO_CPP/oscpack/ip/win32/NetworkingUtils.cpp
            external/TUIO_CPP/oscpack/ip/win32/UdpSocket.cpp)
        set_target_properties(tuio_server PROPERTIES
            COMPILE_FLAGS -DOSC_HOST_LITTLE_ENDIAN)
        target_link_libraries(tuio_server psmoveapi psmoveapi_tracker)
    else()
        # Disable the TUIO Server if we don't build the tracker
        set(PSMOVE_BUILD_TUIO_SERVER OFF)
    endif()

    if(PSMOVE_BUILD_TRACKER)
        foreach(EXAMPLE distance_calibration)
            add_executable(${EXAMPLE} examples/c/${EXAMPLE}.c)
            target_link_libraries(${EXAMPLE} psmoveapi psmoveapi_tracker)
        endforeach()
    endif()
endif()

I guess I should add -DBUILDING_EXAMPLE_DLL and -shared options somewhere. But where exactly? Or maybe I'm missing the point?
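If the intent is to build the example code as DLLs, the usual CMake route is add_library(... SHARED ...) in place of add_executable; CMake then passes -shared to MinGW by itself. A sketch against the loop above, with the target names from the file (this assumes the example sources actually export functions rather than just defining main(), and BUILDING_EXAMPLE_DLL only matters if the code checks that macro):

```cmake
# Build each example as a shared library (a DLL on MinGW) instead of an .exe
if(PSMOVE_BUILD_EXAMPLES)
    foreach(EXAMPLE example multiple dump_calibration battery_check)
        add_library(${EXAMPLE} SHARED examples/c/${EXAMPLE}.c)
        target_link_libraries(${EXAMPLE} psmoveapi)
        # Equivalent of passing -DBUILDING_EXAMPLE_DLL on the command line:
        target_compile_definitions(${EXAMPLE} PRIVATE BUILDING_EXAMPLE_DLL)
    endforeach()
endif()
```

No -shared flag is written by hand anywhere; the SHARED keyword is what selects the DLL link mode.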

How to print/press '@' using keybd_event function?

To press 'a', the code is keybd_event(VkKeyScan(64), 0, 0, 0); and to release the key, keybd_event(VkKeyScan(64), 0, KEYEVENTF_KEYUP, 0);. For pressing '@' I need a combination of two keys, SHIFT & 2, but I don't know how.

keybd_event (http://ift.tt/1P0PxQK)

Using a Variable of an Instance within the Method of another Class in C++

I was wondering how I would go about using a variable of a specific instance of a class within a function of another class.

To provide an example of what I'm trying to do, say I have 3 classes a, b and c. Class c inherits from class b, and a single instance of b and c is called within a method in class a and b respectively. How would I go about using the variable int pos (see below) from a specific instance of class a in class c?

class a
{
    private:
    void B(); //Calls an instance of class c
    int pos; //Variable that I want to use in c
};

class b : public c
{
    private:
    void C(); //Calls an instance of class b
};

class c
{
    private:
    void calculate(int _pos); //Method which requires the value of pos from class a 
};

Help would be greatly appreciated, thank you!
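The conventional answer is to pass the value along explicitly: whoever creates the c (directly or via b) hands pos in as a constructor or function argument, since c cannot implicitly see any particular a instance. A minimal sketch (members made public and the bodies invented purely for illustration):

```cpp
#include <cassert>

class c {
  public:
    // pos arrives as an explicit parameter; c never needs to know about a.
    int calculate(int _pos) { return _pos * 2; }  // placeholder math
};

class b : public c {
  public:
    int C(int pos) { return calculate(pos); }  // forwards a's pos into c
};

class a {
  public:
    explicit a(int p) : pos(p) {}
    int B() {
        b instance;             // "calls an instance of class b"
        return instance.C(pos); // pass the member along explicitly
    }
  private:
    int pos;
};
```

If many values need to flow this way, passing a reference (or pointer) to the a instance into b/c's constructor is the usual generalization.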

Force function parameter to match some rule

Is there any way to "force" a function parameter to follow some rule in C++?
For the sake of example, let's say I want to write a function which computes the nth derivative of a mathematical function. Let's suppose the signature of the function is this one:

double computeNthDerivative(double x, unsigned int n);

Now, let's say I want to forbid users from passing 0 for n. I could just use an assert, or test the value and throw an exception if the user input is 0.
But is there any other way of doing this kind of thing?
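One more option, if n happens to be known at compile time: make it a template parameter, so n == 0 is rejected by the compiler through static_assert rather than at run time. A sketch with a placeholder body:

```cpp
#include <cassert>

// Compile-time enforcement: computeNthDerivative<0>(x) fails to compile
// with the static_assert message. The body here is a stand-in, not a
// real numeric derivative.
template <unsigned int N>
double computeNthDerivative(double x) {
    static_assert(N > 0, "n must be at least 1");
    return x * N; // placeholder for the actual derivative computation
}
```

This only helps when n is a compile-time constant; for run-time values, the assert/exception approaches from the question remain the available choices.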

Pointer versus non-pointer

I have read in many places, including Effective C++, that it is better to store data on the stack and not as a pointer to the data.
I can understand why for small objects: it reduces the number of new and delete calls, which reduces the chance of a memory leak, and the pointer can take more space than the object itself.
But for large objects, where copying is expensive, isn't it better to store them in a smart pointer?
With many operations on a large object there will be a few object copies, which are very expensive (I am not including the getters and setters).
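As a concrete middle ground, a move-only smart pointer keeps the large object heap-allocated while making ownership transfers cost one pointer move instead of a deep copy. A sketch with a hypothetical Big type:

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Big {
    std::vector<double> data;            // imagine megabytes here
    explicit Big(std::size_t n) : data(n, 1.0) {}
};

// Returning/transferring a unique_ptr moves one pointer;
// none of the elements are copied.
std::unique_ptr<Big> make_big() {
    return std::unique_ptr<Big>(new Big(1000));
}
```

With C++11 move semantics, even a by-value Big is often moved rather than copied, so the unique_ptr mainly buys explicit, copy-proof ownership.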

boost::program_options: option recognizes following option as an argument when no argument is provided

I have a program that takes a few options and I want to recognize when no argument is provided.

This is what happens when I call my program without the argument for one option:

program -lib

cout: the required argument for option '-lib' is missing

That's ok, but when I call my program with additional options, e.g

program -lib -out number

The variable assigned to lib gets the value "-out", although -out was declared as an option. I expect to get the same warning as in the first example.

I can solve this problem by adding a custom notifier to all the options, code below:

void validate_string(const std::string& r)
{
    if (*r.begin() == '-') { 
            throw Something
    }
}

...

("lib", po::value<std::string>(&lib)->notifier(validate_string), "Library")

Is there any way of doing this with a built-in mechanism of boost::program_options? I don't like my current solution; the option declarations look messy and are hard to read. Besides, -out is not getting assigned.

BTW: I use allow_long_disguise, so a single - is allowed for long options.

Error in including MongoDB C++ driver in Ubuntu

I have been trying to connect C++ with MongoDB, but it gave errors on many levels, and now I'm stuck compiling a simple piece of code. I have followed this tutorial, and I tried to compile the code given in it using the command below.

g++ tutorial.cpp -Iinstall/include -Linstall/lib -pthread -lmongoclient -lboost_thread -lboost_filesystem -lboost_program_options -lboost_system -o tutorial

But it prints a huge log on the console and exits with an error. The final part of the output is below.

 nce to `boost::re_detail::put_mem_block(void*)'
install/lib/libmongoclient.a(dbclient.o): In function `perl_matcher':
/usr/include/boost/regex/v4/perl_matcher.hpp:374: undefined reference to `boost::re_detail::perl_matcher<__gnu_cxx::__normal_iterator<char const*, std::string>, std::allocator<boost::sub_match<__gnu_cxx::__normal_iterator<char const*, std::string> > >, boost::regex_traits<char, boost::cpp_regex_traits<char> > >::construct_init(boost::basic_regex<char, boost::regex_traits<char, boost::cpp_regex_traits<char> > > const&, boost::regex_constants::_match_flags)'
install/lib/libmongoclient.a(dbclient.o): In function `boost::re_detail::perl_matcher<__gnu_cxx::__normal_iterator<char const*, std::string>, std::allocator<boost::sub_match<__gnu_cxx::__normal_iterator<char const*, std::string> > >, boost::regex_traits<char, boost::cpp_regex_traits<char> > >::match_match()':
/usr/include/boost/regex/v4/perl_matcher_non_recursive.hpp:973: undefined reference to `boost::match_results<__gnu_cxx::__normal_iterator<char const*, std::string>, std::allocator<boost::sub_match<__gnu_cxx::__normal_iterator<char const*, std::string> > > >::maybe_assign(boost::match_results<__gnu_cxx::__normal_iterator<char const*, std::string>, std::allocator<boost::sub_match<__gnu_cxx::__normal_iterator<char const*, std::string> > > > const&)'
collect2: error: ld returned 1 exit status

Can someone explain how to correct this issue?

type deduction from char array to std::string

I'm trying to write a sum function using variadic templates. In my code I would write something like

sum(1, 2., 3)

and it will return the most general type of the sum (in this case, double). The problem is with character strings. When I call it like

sum("Hello", " ", "World!") 

the template parameters are deduced as, e.g., const char [7], so it won't compile. I found a way: specify the last argument as std::string("World!"), but that's not pretty. Is there any way to achieve automatic type deduction to std::string, or to correctly overload sum?

The code I have so far:

template<typename T1, typename T2>
auto sum(const T1& t1, const T2& t2) {
    return t1 + t2;
}

template<typename T1, typename... T2>
auto sum(const T1& t1, const T2&... t2) {
    return t1 + sum(t2...);
}

int main(int argc, char** argv) {
    auto s1 = sum(1, 2, 3);
    std::cout << typeid(s1).name() << " " << s1 << std::endl;

    auto s2 = sum(1, 2.0, 3);
    std::cout << typeid(s2).name() << " " << s2 << std::endl;

    auto s3 = sum("Hello", " ", std::string("World!"));
    std::cout << typeid(s3).name() << " " << s3 << std::endl;

    /* Won't compile! */
    /*
    auto s4 = sum("Hello", " ", "World!");
    std::cout << typeid(s4).name() << " " << s4 << std::endl;
    */

    return 0;
}

Output:

i 6
d 6
Ss Hello World!

Best resources for c++11, c++14 [on hold]

I know my way around C++98/03 and want to learn the new standards, C++11 and C++14. What are the best resources available on the web to learn them? Resources need to be clearly written, with code examples.

Date Struct is not properly being checked

I have some C++ code:

#include <bjarne/std_lib_facilities.h>

struct Date {
    int month;
    int day;
    int year;
};

Date get_date();
Date get_birth_date();
int days_in_month (int month);
bool is_valid_date (const Date& date);
bool is_before (const Date& date1, const Date& date2);

int main()
{
    cout << "Welcome to Age Calculator!\n"; 
    Date current;
    current = get_date();
    cout << "Would you like to see how old you are (y/n)?";
    char answer, slash;
    cin >> answer;
    Date birthday;
    if(answer == 'y'){
        birthday = get_birth_date();
        bool valid = is_valid_date (birthday);
        bool before = is_before (current,birthday);
        while(!valid && !before){
            cout << "Invalid birth date?  Please re-enter: ";
            cin >> birthday.month >> slash >> birthday.day >> slash >> birthday.year;
            valid = is_valid_date (birthday);
            before = is_before (current,birthday);
        }
        cout << "Your birthday is: " << birthday.month << "/" << birthday.day << "/" << birthday.year << "\n";
    }
    else
    cout << "You are so chicken! \n";    
}

Date get_date()
{
    cout << "Please enter today's date (mm/dd/yyyy): ";
    Date today; 
    char slash;
    cin >> today.month >> slash >> today.day >> slash >> today.year;
    bool valid = is_valid_date (today);
    while(!valid){
        cout << "Invalid date?  Please re-enter: ";
        cin >> today.month >> slash >> today.day >> slash >> today.year;
        valid = is_valid_date (today);
    }
    cout << "Date entered was: " << today.month << "/" << today.day << "/" << today.year << "\n";
    return today;
}

Date get_birth_date()
{
    cout << "Please enter your birth date (mm/dd/yyyy): ";
    Date birth; 
    char slash;
    cin >> birth.month >> slash >> birth.day >> slash >> birth.year;
    return birth;
}

int days_in_month (int month)
{
    int month31[7] = {1,3,5,7,8,10,12};
    for(int i = 0; i < 7; i++){
        if(month == month31[i])
            return 31;
    }
    int month30[4] = {4,6,9,11};
    for(int i = 0; i < 4; i++){
        if(month == month30[i])
            return 30;
    }
    if(month == 2)
        return 28;
    return 0; // not reached for valid months
}

bool is_valid_date (const Date& date)
{
    int months[12] = {1,2,3,4,5,6,7,8,9,10,11,12};
    int days = 0; 
    for(int i = 0; i < 12; i++){
        if(date.month == months[i]){
            days = days_in_month (date.month);
            if(date.day <= days && date.day > 1){
                return true;
            }

        }
    }
    return false;
}

bool is_before (const Date& date1, const Date& date2)
{
    cout << date1.day << " " << date2.day; 
    if(date2.year < date1.year){
        return true;
    }
    else if(date2.year == date1.year)
    {
        cout << "-";
        if(date2.month < date1.month)
            return true;
        else if(date2.month == date1.month){
            if(date2.day <= date1.day)
                return true;
        }
        else
            return false;

    }
    return false;
}

I know that the is_valid_date function works, but when I'm testing a birthday that comes after the current day entered, for some reason, it passes the is_before test and never goes to the while loop asking the user to enter a valid birthday. Any suggestions would be greatly appreciated! Thanks in advance!

Edit: The specific inputs that I'm testing are: for today's date, I enter 05/06/2015 and for a birthday, I enter 05/07/2015. It then prints the birthday, which means it skips the while loop in int main(), which it shouldn't do, since the birthday comes after the current date.

Came up with an algorithm for sorting an array of large objects; can anyone tell me what this algorithm is called? (couldn't find it on Google)

I needed to sort an array of large objects, and it got me thinking: could there be a way to minimize the number of swaps?

So I used quicksort (any other fast sort should work here too) to sort indices to the elements in the array; indices are cheap to swap. Then I used those indices to swap the actual objects into their places. Unfortunately this uses O(n) additional space to store the indices. The code below illustrates the algorithm (which I'm calling IndexSort), and in my tests it appears to be faster than plain quicksort for arrays of large objects.

template <class Itr>
void IndexSort(Itr begin, Itr end)
{
    const size_t count = end - begin;

    // Create indices
    vector<size_t> ind(count);
    iota(ind.begin(), ind.end(), 0);

    // Sort indices
    sort(ind.begin(), ind.end(), [&begin] (const size_t i, const size_t j)
    {
        return begin[i] < begin[j];
    });

    // Create indices to indices. This provides
    // constant time search in the next step.
    vector<size_t> ind2(count);
    for(size_t i = 0; i < count; ++i)
        ind2[ind[i]] = i;

    // Swap the objects into their final places
    for(size_t i = 0; i < count; ++i)
    {
        if( ind[i] == i )
            continue;

        swap(begin[i], begin[ind[i]]);

        const size_t j = ind[i];

        swap(ind[i], ind[ind2[i]]);
        swap(ind2[i], ind2[j]);
    }
}

Now I have measured the swaps (of the large objects) done by both quicksort and IndexSort, and found that quicksort does a far greater number of swaps. So I know why IndexSort could be faster.

But can anyone with a more academic background explain why/how this algorithm actually works? (It's not intuitive to me, although I somehow came up with it.)

Thanks!

Edit: The following code was used to verify the results of IndexSort

// A class whose objects will be large
struct A
{
    int id;
    char data[1024];

    // Use the id to compare less than ordering (for simplicity)
    bool operator < (const A &other) const
    {
        return id < other.id;
    }

    // Copy assign all data from another object
    void operator = (const A &other)
    {
        memcpy(this, &other, sizeof(A));
    }
};

int main()
{
    const size_t arrSize = 1000000;

    // Create an array of objects to be sorted
    vector<A> randArray(arrSize);
    for( auto &item: randArray )
        item.id = rand();

    // arr1 will be sorted using quicksort
    vector<A> arr1(arrSize);
    copy(randArray.begin(), randArray.end(), arr1.begin());

    // arr2 will be sorted using IndexSort
    vector<A> arr2(arrSize);
    copy(randArray.begin(), randArray.end(), arr2.begin());

    {
        // Measure time for this
        sort(arr1.begin(), arr1.end());
    }

    {
        // Measure time for this
        IndexSort(arr2.begin(), arr2.end());
    }

    // Check if IndexSort yielded the same result as quicksort
    if( memcmp(arr1.data(), arr2.data(), sizeof(A) * arr1.size()) != 0 )
        cout << "sort failed" << endl;

    return 0;
}

Edit: Made the test less pathological; reduced the size of the large object class to just 1024 bytes (plus one int), and increased the number of objects to be sorted to one million. This still results in IndexSort being significantly faster than quicksort.

Edit: This requires more testing for sure. But it makes me think: what if std::sort could, at compile time, check the object size and, depending on some size threshold, choose either the existing quicksort implementation or this IndexSort implementation?

Also, IndexSort could be described as an "in-place tag sort" (see samgak's answer and my comments below).

does anyone know this mysterious operator ">?" in GCC [duplicate]

This question already has an answer here:

Does anyone know about the >? operator? I have a macro with the definition below, which is throwing an error, but I have never seen such an operator until now:

#define MAX_SIZEOF2(a,b)           (sizeof(a) >? sizeof(b))

Operations between types "struct _object*" and "int" is not allowed

The part of the code that is causing this error is:

newobj = STRINGLIB_NEW(NULL, STRINGLIB_LEN(self));

I have tried to undefine the variable and use malloc. I am not sure how to correct the error. Also, I am compiling this with the ILE C AS/400 compiler. What do I need to do to fix it? The following is the entire code.

/* NOTE: this API is -ONLY- for use with single byte character strings. */
/* Do not use it with Unicode. */

#include "bytes_methods.h"
#include "structmember.h"
#include "stdlib.h"
#include "Python.h"
#include "stdio.h"
#include "string.h"


static PyObject*
stringlib_isspace(PyObject *self)
{
const char* sum = (const char*)malloc(STRINGLIB_STR(self));
    return _Py_bytes_isspace(sum, STRINGLIB_LEN(self));
}

static PyObject*
stringlib_isalpha(PyObject *self)
{
const char* sum = (const char*)malloc(STRINGLIB_STR(self));
    return _Py_bytes_isalpha(sum, STRINGLIB_LEN(self));
}

static PyObject*
stringlib_isalnum(PyObject *self)
{
const char* sum = (const char*)malloc(STRINGLIB_STR(self));
    return _Py_bytes_isalnum(sum, STRINGLIB_LEN(self));
}

static PyObject*
stringlib_isdigit(PyObject *self)
{
const char* sum = (const char*)malloc(STRINGLIB_STR(self));
    return _Py_bytes_isdigit(sum, STRINGLIB_LEN(self));
}

static PyObject*
stringlib_islower(PyObject *self)
{
const char* sum = (const char*)malloc(STRINGLIB_STR(self));
    return _Py_bytes_islower(sum, STRINGLIB_LEN(self));
}

static PyObject*
stringlib_isupper(PyObject *self)
{
const char* sum = (const char*)malloc(STRINGLIB_STR(self));
    return _Py_bytes_isupper(sum, STRINGLIB_LEN(self));
}

static PyObject*
stringlib_istitle(PyObject *self)
{
const char* sum = (const char*)malloc(STRINGLIB_STR(self));
    return _Py_bytes_istitle(sum, STRINGLIB_LEN(self));
}


/* functions that return a new object partially translated by ctype funcs: */

static PyObject*
stringlib_lower(PyObject *self)
{
char* sum1;
const char* sum = (const char*)malloc(STRINGLIB_STR(self));
    PyObject* newobj;
    newobj = STRINGLIB_NEW(NULL, STRINGLIB_LEN(self));
    sum1 = (char*)malloc(STRINGLIB_STR(newobj));
    if (!newobj)
            return NULL;
    _Py_bytes_lower(sum1, sum,
                 STRINGLIB_LEN(self));
    return newobj;
}

static PyObject*
stringlib_upper(PyObject *self)
{
char* sum1;
const char* sum = (const char*)malloc(STRINGLIB_STR(self));
    PyObject* newobj;
    newobj = STRINGLIB_NEW(NULL, STRINGLIB_LEN(self));
    sum1 = (char*)malloc(STRINGLIB_STR(newobj));
    if (!newobj)
            return NULL;
    _Py_bytes_upper(sum1, sum,
                 STRINGLIB_LEN(self));
    return newobj;
}

static PyObject*
stringlib_title(PyObject *self)
{
char* sum1;
char* sum = (char*)malloc(STRINGLIB_STR(self));
    PyObject* newobj;
    newobj = STRINGLIB_NEW(NULL, STRINGLIB_LEN(self));
    sum1 = (char*)realloc(sum,STRINGLIB_STR(newobj));
    if (!newobj)
            return NULL;
    _Py_bytes_title(sum1,sum,
                 STRINGLIB_LEN(self));
    return newobj;
}

static PyObject*
stringlib_capitalize(PyObject *self)
{
char* sum1;
char* sum = (char*)malloc(STRINGLIB_STR(self));
    PyObject* newobj;
    newobj = STRINGLIB_NEW(NULL, STRINGLIB_LEN(self));
    if (!newobj)
            return NULL;
    sum1 = (char*)realloc(sum,STRINGLIB_STR(newobj));
    _Py_bytes_capitalize(sum1, sum,
                      STRINGLIB_LEN(self));
    return newobj;
}

static PyObject*
stringlib_swapcase(PyObject *self)
{
char* sum1;
char* sum = (char *)malloc(STRINGLIB_STR(self));
    PyObject* newobj;
    newobj = STRINGLIB_NEW(NULL, STRINGLIB_LEN(self));
    sum1 = (char*)realloc(sum,STRINGLIB_STR(newobj));
    if (!newobj)
            return NULL;
    free (sum);
    _Py_bytes_swapcase(sum1,sum,
                    STRINGLIB_LEN(self));
    return newobj;
}

Can someone please clarify what the "<" does when comparing two strings c++ [duplicate]

This question already has an answer here:

I had a past-paper question involving the STL sort algorithm that sorted a vector of strings. I am aware that the sort algorithm sorts into ascending order using the "<" operator, but I was wondering how two strings are compared. How is a value for a string found? (So I can work out which string is smaller than another in an exam.)

even,odd,prime numbers

I need to find the even, odd, and prime numbers from a file I am reading:

int _tmain(int argc, _TCHAR* argv[])
{
    ifstream read;
    read.open("input.txt");
    ofstream write;

    write.close();
    read.close();
    system("pause");
    return 0;
}

C++ Call by reference

If I have a function which takes a pointer to an integer, and I pass the address of an integer variable from my main, is this call by value or call by reference? Sample code:

#include <iostream>

using namespace std;

void fun(int *a){
//Code block
}

int main(){
    int a = 5;
    fun(&a);
    return 0;
}

In the above code, is the call to function fun a call by value or call by reference?

C++ Extract number from the middle of a string

I have a vector containing strings that follow the format of text_number-number

Eg: Example_45-3

I only want the first number (45 in the example) and nothing else, which I am able to do with my current code:

std::vector<std::string> imgNumStrVec;
for(size_t i = 0; i < StrVec.size(); i++){
    std::vector<std::string> seglist;
    std::stringstream ss(StrVec[i]);
    std::string seg, seg2;
    while(std::getline(ss, seg, '_')) seglist.push_back(seg);
    std::stringstream ss2(seglist[1]);
    std::getline(ss2, seg2, '-');
    imgNumStrVec.push_back(seg2); 
}

Is there a more streamlined and simpler way of doing this? If so, what is it?

I ask purely out of a desire to learn how to code better; at the end of the day, the code above does successfully extract the first number, but it seems long-winded and round-about.

How to use one rendered sprite texture over and over

I'm creating a game that has sprites. At the start of the game, before anything is shown, all of the sprites that will be used are loaded and a texture is created for each of them. My problem is that when I create an object, I want to give it the same rendered texture as the texture it corresponds to, e.g. if the new object is a grass tile, then give it the same texture as the already loaded grass tile texture.

Nothing goes wrong as such, because everything works as intended; the problem is that the game uses way too much memory (around the 150 MB mark) because of the Tile::Draw function. Anywhere before Tile::Draw is called, the game only uses around 10 MB (which is just the images loaded at the start). So my question is: how do I reuse the same texture over and over, without re-rendering it, and still draw it to the window? I'll give some of my code for drawing.

The function that draws the sprites:

void Tile::Draw(){ //here is where each sprite is drawn 
    sprite->Begin(D3DXSPRITE_ALPHABLEND); 
    sprite->Draw(tex, NULL, NULL, &position, colour); //here is where more memory is used each time a sprite is drawn
    sprite->End();
}

The function that creates the sprite texture:

void Tile::CreateTexture(IDirect3DDevice9* device){
    D3DXCreateTextureFromFileEx(device, filePath, getDimensions().width, getDimensions().height, 1, D3DPOOL_DEFAULT,
    D3DFMT_UNKNOWN, D3DPOOL_MANAGED, D3DX_DEFAULT, D3DX_DEFAULT, 0, NULL, NULL, &tex);

    //Attempt to create the sprite
    D3DXCreateSprite(device, &sprite);
}

This is the class for each tile object which holds the sprite, texture, and position variables.

class Tile{
public:
    void setIndex(int x, int y) { position.x = x; position.y = y; position.z = 0; } //used for setting the position of the tile
    void setColour(D3DCOLOR colour) { this->colour = colour; } //used for setting the colour value of the object 
    void setTexture(LPDIRECT3DTEXTURE9 *tex) { this->tex = *tex; } //used for setting the texture value of the object
    void setSprite(LPD3DXSPRITE sprite) { this->sprite = sprite; } //used for setting the sprute value of the object
    void CreateTexture(IDirect3DDevice9* device); //used for creating the texture for the object

    LPDIRECT3DTEXTURE9 getTexture() { return tex; } //used for returning the texture of the object
    LPD3DXSPRITE getSprite() { return sprite; } //used for returning the sprite of the object

    void Draw(); //used for drawing the object

protected:
    D3DXVECTOR3 position; //used for storing the position of the object
    LPDIRECT3DTEXTURE9 tex; //used for storing the texture of the object
    LPD3DXSPRITE sprite; //used for storing the sprute value of the object
    D3DCOLOR colour; //used for storing the colour value of the object
};

When the tiles are first created at the start of the game, they are stored in a vector so they can be accessed later:

    std::vector<Tile> tileList;

Here is what I do when I want to create a new Tile object that will be drawn:

Tile tile; //create a new tile object
tile = tileList[0]; //give the new object the exact same values as the already created tile, including sprite, texture, etc.

I then later call a draw function to draw all of the tiles to the window:

void Map::draw(){
    tile.Draw(); //call the tile's own draw function so it can be drawn
}

Any help would be greatly appreciated, and actual code even more so.

Thanks.

Embedding Cython in C++

I am trying to embed a piece of Cython code in a C++ project, such that I can compile a binary that has no dependency on Python 2.7 (so users can run the executable without having Python installed). The Cython source is not pure Cython: there is also Python code in there.

I am compiling my Cython code using distutils in the following script (setup.py):

from distutils.core import setup
from Cython.Build import cythonize

setup(
    ext_modules = cythonize("test.pyx")
)

I then run the script using python setup.py build_ext --inplace. This generates a couple of files: test.c, test.h, test.pyd and some library files: test.exp, test.obj and test.lib.

What would be the proper procedure to import this into C++? I managed to get it working by including test.c and test.h during compilation and test.lib during linking.

I am then able to call the Cython functions after I issue

Py_Initialize();
inittest();

in my C++ code.

The issue is that there are numerous dependencies on Python, both during compilation (e.g., in test.h) and during linking. The bottom line is that in order to run the executable, Python has to be installed (otherwise I get errors about a missing python27.dll).

Am I going in the right direction with this approach? There are so many options that I am just very confused about how to proceed. Conceptually, it also does not make sense to me why I should call Py_Initialize() if I want the whole thing to be Python-independent. Furthermore, this is apparently the 'Very High Level Embedding' method rather than a low-level Cython embedding, but this is just how I got it to work.

If anybody has any insights on this, that would be really appreciated.

"linker command failed with exit code 1 (use -v to see invocation)" on xcode

Here is the error report

Undefined symbols for architecture x86_64:
"RightTriangle::RightTriangle()", referenced from: _main in main.o ld: symbol(s) not found for architecture x86_64 clang: error: linker command failed with exit code 1 (use -v to see invocation)

RightTriangle is a class that I wrote, defined in a header file.

Edit: Figured it out. I had a constructor that was declared but had no definition (e.g. RightTriangle(); with no body anywhere), which the linker did not like; I fixed it by giving it a definition.