Loop 1 (N times, where N is the number of pixels in the output picture)
{
    Loop 2 (M times, where M is the number of pixels in the input texture)
    {
        Loop 3 (c times, where c is the size of the neighborhood)
        {
            simple multiplication and addition instructions (O(1))
        }
    }
}
Since Loop 2 is an exhaustive search for the nearest neighbor, I simply moved that whole loop onto the GPU. I store the input texture patch in device memory and compute, in parallel, the distances between a given output pixel's neighborhood and the neighborhood of every pixel in the input texture.
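To make the decomposition concrete, here is a minimal CPU-side sketch in C++ of how the work might be split. It is not the code from this post. The names Image, candidate_distance, and best_match are hypothetical, and the grayscale pixels, full square window of half-width r, wrap-around borders, and sum-of-squared-differences metric are all assumptions. The idea is that candidate_distance is what one GPU thread would run for one input pixel, and the first loop in best_match stands in for the kernel launch over all M candidates.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical grayscale image: row-major floats, width w, height h.
struct Image {
    int w, h;
    std::vector<float> px;
    // Wrap-around (toroidal) lookup, an assumption that avoids border cases.
    float at(int x, int y) const {
        x = ((x % w) + w) % w;
        y = ((y % h) + h) % h;
        return px[static_cast<std::size_t>(y) * w + x];
    }
};

// Loop 3: the work one GPU thread would do for input pixel `idx`.
// Returns the sum of squared differences between the neighborhood of
// output pixel (ox, oy) and the neighborhood of the candidate input pixel.
float candidate_distance(const Image& in, const Image& out,
                         int ox, int oy, int idx, int r) {
    const int ix = idx % in.w;
    const int iy = idx / in.w;
    float d = 0.0f;
    for (int dy = -r; dy <= r; ++dy) {
        for (int dx = -r; dx <= r; ++dx) {
            const float diff = out.at(ox + dx, oy + dy) - in.at(ix + dx, iy + dy);
            d += diff * diff;
        }
    }
    return d;
}

// Loop 2: every iteration is independent, so on the GPU each one would be
// a separate thread. A plain loop stands in for the kernel launch here.
int best_match(const Image& in, const Image& out, int ox, int oy, int r) {
    const int M = in.w * in.h;
    std::vector<float> dist(M);
    for (int idx = 0; idx < M; ++idx) {
        dist[idx] = candidate_distance(in, out, ox, oy, idx, r);
    }
    // Argmin over the M distances (a parallel reduction on the GPU).
    int best = 0;
    for (int idx = 1; idx < M; ++idx) {
        if (dist[idx] < dist[best]) best = idx;
    }
    return best;
}
```

Calling best_match(in, out, x, y, r) once per output pixel corresponds to Loop 1, which stays on the host in this sketch. The returned value is the row-major index of the closest input pixel.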
Performance improved noticeably, as shown in the following table:


