SpMV matrix multiplication CSR format using tiles
Hi everybody, does anyone know how to use tiles for CSR matrix format multiplication? Here is the serial code in C++: for (int i=0; i<_nb_rows; ++i) for (int j=_row_ptr[i]-1;...
Different data results between Debug and Release
Hi everybody, I'm running a particular matrix multiplication using only 1 dimension and tiles, and the problem is that I'm not getting the same last element of the result vector when running Debug GPU...
RTOS RTX API coding examples
I'm looking for RTX API (see Reference guide) coding examples in various languages. I'm hoping for short independent examples for everything in the API. Is RTX 5.0 the latest version of the API? Reference...
Nested parallel_for_each or similar constructs
Hi there. I need to perform an operation as illustrated in the code below. Is there a way to structure this code so that I can call the two operations in some sort of nested p_f_e loops? I do not want to...
GetDeviceProperties for AMP accelerator?
Is there any way to get more detailed accelerator properties in AMP? Specifically, how can I get the number of supported concurrent threads on the GPU/accelerator? In CUDA I used GetDeviceProperties, and it...
Vector Types Supported by C++ AMP
I am learning OpenCL and C++ AMP, so I am new to both. I like C++ AMP because it uses modern C++. But I did notice that OpenCL has more vector types, i.e. int2, int4, int8, int16. It appears...
AMP arrays in fast shared memory?
If I have something like this: parallel_for_each(av.extent, [=](index<1> idx) restrict(amp) { int n; int cnt[8]; // some simulation code that uses both n and cnt for (n=0; n< scope; n++) if...
Debugging AMP code on DirectX resources
For my project I receive BGRA32 images from a camera as void* pointers. I'd like to use two images and either show both side by side on screen, or do some calculation with AMP that leads to one...
My CPU outperforms my GPU... is anything wrong with code?
I just wanted to test the performance of CPU vs GPU. Here are my test results. CPU took about 1 second. GPU took about 2.66 seconds. ATI Radeon HD 5700 Series. DoTestMathOnCPU start seconds it took to...
Arrays of array_view
Hi, hopefully this isn't a particularly dumb question. But how can I use an array of array_view (or just AMP arrays) in a p_f_e block? My algorithm (in a nutshell) has a sequence of operations passed...
What do you want in the next version of C++ AMP? – we are listening
Visual Studio 2012 includes the first release of the C++ AMP technology, and hopefully by now you have had a chance to learn about it and, even better, try your hand at it. We would like you to know that...
Fastest way to copy array in C++ AMP
Hello everyone, I want to copy an array of bytes to another one. It seems that "std::copy" or "concurrency::copy" are faster than copying within C++ AMP, right? Here is the code: std::copy(buffer, buffer +...
I can debug kernel in one project but not kernel in another one
I have two C++ projects: 1) an example C++ AMP program from MSDN, 2) my own C++ AMP test code. I can step into the kernel and debug #1, but I can't debug the kernel in #2. I get hollow breakpoints in C++ code as well...
Where are captured variables stored?
In the function below: 1) Where is "x"? Is it in global GPU memory space? 2) Where is "y"? Is it in global memory or local memory (in my lambda)? In OpenCL I can specify global or local for variables in...
Why unbounded_buffer lost data values in my test sample
Hi, I ran into a problem when I tried to use agent and unbounded_buffer. Here is the test sample: one unbounded_buffer<double> object is used to transmit data values from agent a1 to a2. However, some...
LoadString() is not successful when calling thread is not main thread
I am using parallel_for() to run part of my program on multiple threads. If there is a problem in our calculations, we write one of our warnings to a buffer in memory. Our warnings are in our .rc file....
C++AMP-ish Direct3d
Hi! An idea came to mind when I was posting on the isocpp.org forums about upcoming C++14 features, and I thought I'd share it here and ask for the opinions of fellow programmers: How...
Parallel_for design and implementation issues
Hello everyone, this is my story: I have N x n iterations. The inner loops are used to calculate the number s*, and the outer N loops are used to collect and sum these numbers. I would like to spread the...
How are tiles passed to GPU?
Say I have a 1024 x 1024 2D array and I tile it into 32 x 32 tiles. Is the whole 2D array sent, or are 32 x 32 bite-sized chunks sent to the GPU one at a time?
std::vector overflows array
When my program executes and hits this loop: i = 0; for (vector<Particle>::iterator I = Particles.begin(); I != Particles.end(); I++) { bool Did_Surf = false; if (I->Invalidate() == false) {...