Optimizing gradient function in C++

Question

I'm computing the gradient of an image manually (without using built-in functions) and I want to make it faster but keeping the same performance. I'm very open to any suggestions; I'm currently learning C++ optimization and the STL in recent C++.

Here is my code. My main concern is the gradiantAndDirection function, which I need to optimize. The TimerAvrg struct is how I measure the running time of that function:

#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc.hpp>
#include <iostream>
#include <chrono>


struct   TimerAvrg
{
    std::vector<double> times;
    size_t curr=0,n;
    std::chrono::high_resolution_clock::time_point begin,end;
    TimerAvrg(int _n=30)
    {
        n=_n;
        times.reserve(n);
    }

    inline void start()
    {
        begin= std::chrono::high_resolution_clock::now();
    }

    inline void stop()
    {
        end= std::chrono::high_resolution_clock::now();
        double duration=double(std::chrono::duration_cast<std::chrono::microseconds>(end-begin).count())*1e-6;
        if ( times.size()<n)
            times.push_back(duration);
        else{
            times[curr]=duration;
            curr++;
            if (curr>=times.size()) curr=0;}
    }

    double getAvrg()
    {
        double sum=0;
        for(auto t:times)
            sum+=t;
        return sum/double(times.size());
    }
};

//Those variables will be in a class with gradiantAndDirection
uchar *dirImg;
int gradThresh = 20;

void gradiantAndDirection(cv::Mat& GxGy, const cv::Mat &grey)
{
    cv::Mat smoothImage;
    GaussianBlur(grey, smoothImage, cv::Size(5, 5), 1.0);
    uchar *smoothImg = smoothImage.data;

    GxGy.create( grey.size(),CV_16SC1);
    short* gradImg = (short*)GxGy.data;

    dirImg = new unsigned char[grey.cols*grey.rows];

    //Initialization of row = 0, row = height-1, column=0, column=width-1 
    for (int j = 0; j<grey.cols; j++)
        gradImg[j] = gradImg[(grey.rows - 1)*grey.cols + j] = gradThresh - 1;

    for (int i = 1; i<grey.rows - 1; i++)
        gradImg[i*grey.cols] = gradImg[(i + 1)*grey.cols - 1] = gradThresh - 1;

    for (int i = 1; i<grey.rows - 1; i++) {
        for (int j = 1; j<grey.cols - 1; j++) {
            int com1 = smoothImg[(i + 1)*grey.cols + j + 1] - smoothImg[(i - 1)*grey.cols + j - 1];
            int com2 = smoothImg[(i - 1)*grey.cols + j + 1] - smoothImg[(i + 1)*grey.cols + j - 1];
            int gx;
            int gy;
            gx = abs(com1 + com2 +  (smoothImg[i*grey.cols + j + 1] - smoothImg[i*grey.cols + j - 1]));
            gy = abs(com1 - com2 +  (smoothImg[(i + 1)*grey.cols + j] - smoothImg[(i - 1)*grey.cols + j]));
            int sum;
            if(0)
                sum = gx + gy;
            else
                sum = (int)sqrt((double)gx*gx + gy*gy);

            int index = i*grey.cols + j;
            gradImg[index] = sum;
            if (sum >= gradThresh) {
                if (gx >= gy) dirImg[index] = 1;//1 vertical
                else      dirImg[index] = 2;//2 Horizontal
            }
        }
    }
}


int main( int argc, char** argv )
{
    cv::Mat image;
    image = cv::imread(argv[1], cv::IMREAD_GRAYSCALE);   

    float sum=0;
    cv::Mat GxGy;
    for(int alpha = 0; alpha <20 ; alpha++)
    {
        TimerAvrg Fps;
        Fps.start();
        gradiantAndDirection(GxGy, image);
        Fps.stop();
        sum = sum + Fps.getAvrg()*1000;
    }

    std::cout << "\rTime detection=" << sum/19 << " milliseconds" << std::endl;
    cv::resize(GxGy,GxGy,cv::Size(image.cols/2,image.rows/2));
    cv::Mat result8UC1;
    cv::convertScaleAbs(GxGy, result8UC1);
    cv::imshow( "Display window",result8UC1);
    cv::waitKey(0);

    return 0;
}

I compile my code on Ubuntu 16.04 with gcc 8.1.0:

g++ -std=c++1z -fomit-frame-pointer -O3 -ffast-math -mmmx -msse -msse2 -msse3 -DNDEBUG -Wall improve_code.cpp -o improve_code -fopenmp `pkg-config --cflags --libs opencv`

What do you mean by "make it faster but keeping the same performance" - isn't that a contradiction?
– Toby Speight, Commented Jan 9, 2019 at 10:54
@TobySpeight I think we can make a faster program but sometimes we lose the optimal performance, like calling "a.empty()" instead of "a.size() == 0".
– Ja_cpp, Commented Jan 9, 2019 at 10:59

Toby Speight · Accepted Answer · 2019-01-09 14:22:06Z

Include what you use

We use std::abs and std::vector but the code lacks the necessary includes:

#include <cmath>
#include <vector>

Perform a single task well

Why do we perform the Gaussian blur? What if the input is already sufficiently blurred, or if it needs a bigger kernel?

If we make the caller responsible for this preparation step, we give it more control, and we can focus on a single responsibility in this function.

Fix the memory leak

This allocation is never released:

dirImg = new unsigned char[grey.cols*grey.rows];

In fact, this entire (global) array seems only to be assigned to, and never used, so we could remove it entirely.
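
If the direction map does turn out to be needed, one option is to let a std::vector own it, so that nothing has to be released by hand. The following is only a sketch: DirectionMap, resize() and at() are hypothetical names, not part of the original code. It also zero-fills the buffer, because the original loop writes a direction only where the threshold is met and leaves every other element of the new[] array uninitialised.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical owner for the per-pixel direction map (names are assumptions).
struct DirectionMap {
    int cols = 0;
    int rows = 0;
    std::vector<unsigned char> data;  // freed automatically by the destructor

    // Resize to cols x rows and set every element to 0 ("no direction").
    void resize(int newCols, int newRows)
    {
        assert(newCols >= 0 && newRows >= 0);
        cols = newCols;
        rows = newRows;
        data.assign(static_cast<std::size_t>(cols) * rows, 0);
    }

    // Row-major element access; no bounds checking.
    unsigned char& at(int row, int col)
    {
        return data[static_cast<std::size_t>(row) * cols + col];
    }
};
```

Calling resize() once per frame reuses the vector's existing capacity when the image size has not grown, so this should not cost an extra allocation per call.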

Simplified indexing

It might be easier to use index throughout, and add rows or columns directly with ±grey.cols and ±1:

      const int index = i*grey.cols + j;
      int com1 = smoothImg[index + grey.cols + 1] - smoothImg[index - grey.cols - 1];
      int com2 = smoothImg[index - grey.cols + 1] - smoothImg[index + grey.cols - 1];
      int gx = std::abs(com1 + com2
                        + smoothImg[index + 1] - smoothImg[index - 1]);
      int gy = std::abs(com1 - com2
                        + smoothImg[index + grey.cols] - smoothImg[index - grey.cols]);
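
To check the single-index form in isolation, here is a small self-contained sketch that works on a plain row-major buffer rather than a cv::Mat. The Gradient struct and the gradientAt() function are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Hypothetical result type: absolute horizontal and vertical responses.
struct Gradient {
    int gx;
    int gy;
};

// Responses at interior pixel (i, j) of a row-major 8-bit image with `cols`
// columns, using one base index plus +/-cols and +/-1 offsets.
Gradient gradientAt(const std::vector<unsigned char>& img, int cols, int i, int j)
{
    assert(i >= 1 && j >= 1 && j < cols - 1);
    assert((i + 2) * cols <= static_cast<int>(img.size()));  // row i+1 exists

    const int index = i * cols + j;
    const int com1 = img[index + cols + 1] - img[index - cols - 1];
    const int com2 = img[index - cols + 1] - img[index + cols - 1];
    const int gx = std::abs(com1 + com2 + img[index + 1] - img[index - 1]);
    const int gy = std::abs(com1 - com2 + img[index + cols] - img[index - cols]);
    return {gx, gy};
}
```

For the 3x3 image with rows 1 2 3, 4 5 6 and 7 8 9, the centre pixel gives gx = 6 (right column sum minus left column sum) and gy = 18 (bottom row sum minus top row sum), which is what the original three-row indexing produces as well.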

Use the standard library

There's no need to (badly) re-write std::hypot() for the computation of sum (also, drop the if (0) - that's always false).

      const int sum = std::hypot(gx, gy);
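
As a standalone illustration, the magnitude computation could be wrapped like this. The function name magnitude() is an assumption made for the example. Integer arguments to std::hypot are promoted to double, and the explicit cast truncates toward zero, just as the (int) cast in the original did.

```cpp
#include <cassert>
#include <cmath>

// Gradient magnitude from the two absolute responses, truncated toward zero.
int magnitude(int gx, int gy)
{
    assert(gx >= 0 && gy >= 0);  // both inputs come from std::abs()
    return static_cast<int>(std::hypot(gx, gy));
}
```

Making the conversion explicit with static_cast also documents that the narrowing from double to int is intentional.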

Saturate, don't overflow

The std::hypot() value could conceivably be larger than the range of int, or of gradImg[] (depending on the relative sizes of the target's integer types) - in such cases, we should std::clamp the value to the possible range, rather than suffering Undefined Behaviour.
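
A minimal sketch of such saturation, assuming C++17 for std::clamp; saturateToShort() is a hypothetical helper name. The clamp has to happen while the value is still a double, because converting an out-of-range floating-point value to an integer type is itself undefined.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// Convert a magnitude to short, saturating at the ends of the short range
// instead of overflowing. Fractions are truncated toward zero.
short saturateToShort(double value)
{
    assert(!std::isnan(value));
    constexpr double lo = std::numeric_limits<short>::min();
    constexpr double hi = std::numeric_limits<short>::max();
    return static_cast<short>(std::clamp(value, lo, hi));
}
```

With this helper, the store into the gradient image could be written as gradImg[index] = saturateToShort(std::hypot(gx, gy));.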

Spelling

Prefer standard spelling: gradient, not gradiant.
