<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Using GPUs to accelerate neuroimaging</title>
	<atom:link href="http://www.cabiatl.com/gpu/?feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://www.cabiatl.com/gpu</link>
	<description>Harness graphics cards for processing brain images</description>
	<lastBuildDate>Thu, 08 Apr 2010 18:06:20 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Optimized new segment routine in SPM</title>
		<link>http://www.cabiatl.com/gpu/?p=49</link>
		<comments>http://www.cabiatl.com/gpu/?p=49#comments</comments>
		<pubDate>Thu, 08 Apr 2010 18:06:20 +0000</pubDate>
		<dc:creator>author</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.cabiatl.com/gpu/?p=49</guid>
		<description><![CDATA[Porting new segment routine to Jacket m-code.]]></description>
			<content:encoded><![CDATA[<p>Jacket is a GPU computing engine for Matlab. It lets you write modules in ordinary Matlab m-code that execute on the GPU. The only requirement is that the data to be operated on is first pushed to the GPU using simple functions such as gdouble and gsingle.</p>
<p>We are currently porting the new segment Matlab routine in SPM8 to equivalent Jacket m-code to accelerate the routine as a whole. Performance results will be coming soon.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cabiatl.com/gpu/?feed=rss2&amp;p=49</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>B-Spline Acceleration using CUDA and Jacket</title>
		<link>http://www.cabiatl.com/gpu/?p=39</link>
		<comments>http://www.cabiatl.com/gpu/?p=39#comments</comments>
		<pubDate>Thu, 08 Apr 2010 17:51:42 +0000</pubDate>
		<dc:creator>author</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.cabiatl.com/gpu/?p=39</guid>
		<description><![CDATA[Performance comparison of various MEX implementations of B-spline interpolation]]></description>
			<content:encoded><![CDATA[<p>The deal-breaker for GPU acceleration is often the high cost of transferring data between the CPU and the GPU. To reduce this cost to some extent, we used Jacket, the Matlab GPU computing engine by AccelerEyes, to enhance our CUDA MEX file for B-spline interpolation. For comparison, we also wrote a multi-threaded CPU implementation of the B-spline interpolation MEX file using pthreads. We decided to concentrate first on a popular SPM routine, &#8220;new segment&#8221;, which offers several benefits over normal unified segmentation: (i) a slightly different treatment of the mixing proportions, (ii) an improved registration model, (iii) the ability to use multi-spectral data, and (iv) an extended set of tissue probability maps, which allows a different treatment of voxels outside the brain.</p>
<p>The results of the comparison are as follows. Relative to a baseline CPU implementation running on an Intel quad-core Nehalem at 2.6 GHz, the multi-threaded CPU version gives a 1.2X speedup, while the CUDA MEX implementation enhanced with Jacket gives a 17X speedup.</p>
<p><img class="size-full wp-image-40 alignleft" title="Performance comparison" src="http://www.cabiatl.com/gpu/wp-content/uploads/2010/04/Slide1.jpg" alt="Performance comparison" width="518" height="389" /></p>
]]></content:encoded>
			<wfw:commentRss>http://www.cabiatl.com/gpu/?feed=rss2&amp;p=39</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Profiling SPM</title>
		<link>http://www.cabiatl.com/gpu/?p=23</link>
		<comments>http://www.cabiatl.com/gpu/?p=23#comments</comments>
		<pubDate>Fri, 30 Oct 2009 10:34:35 +0000</pubDate>
		<dc:creator>author</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.cabiatl.com/gpu/?p=23</guid>
		<description><![CDATA[Profiling helps determine bottlenecks in the SPM processing pipeline]]></description>
			<content:encoded><![CDATA[<p>The first step in speeding up an application is <a href="http://en.wikipedia.org/wiki/Profiling_(computer_programming)" target="_blank">profiling</a> it to find out which parts consume most of the time and would therefore benefit most from parallelization.</p>
<p>Because SPM is Matlab-based software, the natural choice of profiling tool was the &#8216;Matlab Profiler&#8217;. The profiler provides a wealth of useful information about a Matlab application. It works much like an API: a set of functions that can be &#8216;plugged into&#8217; Matlab code. It records execution time, number of calls, parent functions, child functions, code line hit count, and code line execution time. It can also save a comprehensive profiling report as an HTML file for future reference. For those who want to learn more, this <a href="http://www.mathworks.com/access/helpdesk/help/techdoc/index.html?/access/helpdesk/help/techdoc/ref/profile.html&amp;http://www.google.com/search?client=safari&amp;rls=en&amp;q=matlab+profiler&amp;ie=UTF-8&amp;oe=UTF-8">link</a> from the MathWorks website explains the profiler in detail.</p>
<p>I profiled the SPM fMRI workflow with the &#8216;Matlab Profiler&#8217;. Because of the way the profiler works, the entire workflow has to be expressed as a batch file so that profiling code can be &#8216;plugged in&#8217; between calls to the workflow stages. The Batch interface in SPM makes it easy to create such a batch file. The dataset I used for profiling was the famous/non-famous face repetition example<span style="font-family: Helvetica, 'Times New Roman', 'Bitstream Charter', Times, serif; color: #f2007d;"><span style="line-height: normal;"> <span style="color: #1e1e1e;">at this <a href="http://www.fil.ion.ucl.ac.uk/spm/data/face_rep/">link</a>.</span></span></span></p>
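<p>The idea of wrapping a batch-style workflow in profiler calls can be sketched outside Matlab as well. The example below is only an analogy in Python using the standard cProfile module, not the SPM batch file used here; the stage functions realign and segment are hypothetical stand-ins with made-up workloads. In Matlab the corresponding calls would be profile on, profile off and profsave.</p>

```python
import cProfile
import io
import pstats


def realign(n):
    """Hypothetical stand-in for one workflow stage."""
    return sum(i * i for i in range(n))


def segment(n):
    """Hypothetical stand-in for a second, heavier stage."""
    return sum(i ** 0.5 for i in range(3 * n))


def run_workflow(n=20000):
    """Batch-style driver that calls the stages in order."""
    return realign(n), segment(n)


def profile_workflow():
    """Profile the whole batch and return a text report sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()    # comparable in spirit to Matlab's "profile on"
    run_workflow()
    profiler.disable()   # comparable in spirit to "profile off"
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(10)
    return buffer.getvalue()


print(profile_workflow())
```

<p>The report lists each function with its call count and cumulative time, which is the same kind of per-stage breakdown that was used to rank the SPM stages.</p>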
<p><span style="font-family: Helvetica, 'Times New Roman', 'Bitstream Charter', Times, serif; color: #f2007d;"><span style="line-height: normal;"><span style="color: #1e1e1e;">The results obtained from profiling are as follows:</span></span></span></p>
<p><span style="font-family: Helvetica, 'Times New Roman', 'Bitstream Charter', Times, serif; color: #1e1e1e;"><span style="line-height: normal;"><img class="alignnone size-full wp-image-25" title="Profiling Result" src="http://www.cabiatl.com/gpu/wp-content/uploads/2009/10/profile.jpg" alt="Profiling Result" width="362" height="218" /></span></span></p>
<p><span style="font-family: Helvetica, 'Times New Roman', 'Bitstream Charter', Times, serif; color: #1e1e1e;"><span style="line-height: normal;">The pie chart above shows that Segmentation, Model Estimation and Re-alignment are the most expensive operations. Since Model Estimation is very case-specific, we set it aside for now and focus on Segmentation and Re-alignment, which follow a fixed mathematical algorithm and are good candidates for parallelization.</span></span></p>
<p><span style="font-family: Helvetica, 'Times New Roman', 'Bitstream Charter', Times, serif; color: #1e1e1e;"><span style="line-height: normal;">In the course of profiling, I also observed that B-spline interpolation consumed about 18% of the total workflow time, spread across the Segmentation and Re-alignment stages. Within SPM, B-spline interpolation is implemented as a MEX-file written in C, so it can be rewritten in CUDA and incorporated into SPM using the &#8216;nvmex&#8217; utility provided by nVidia. I therefore decided to start speeding up SPM by implementing B-spline interpolation in CUDA.</span></span></p>
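<p>To put the 18% figure in perspective, Amdahl's law gives a rough estimate of how much the whole workflow can gain from accelerating B-spline interpolation alone. The sketch below assumes the 18% share is representative; the kernel speedups in the loop are hypothetical values chosen for illustration, not measurements.</p>

```python
def overall_speedup(fraction, stage_speedup):
    """Amdahl's law: whole-workflow speedup when only `fraction`
    of the run time is accelerated by a factor `stage_speedup`."""
    return 1.0 / ((1.0 - fraction) + fraction / stage_speedup)


bspline_fraction = 0.18  # share of workflow time reported by the profiler

# Hypothetical kernel speedups, for illustration only.
for s in (2.0, 10.0, 100.0):
    gain = overall_speedup(bspline_fraction, s)
    print(f"{s:6.1f}x kernel -> {gain:.3f}x workflow")

# Upper bound if the interpolation took no time at all.
upper_bound = 1.0 / (1.0 - bspline_fraction)
print(f"upper bound: {upper_bound:.3f}x")
```

<p>Under these assumptions the workflow as a whole cannot get more than about 1.22 times faster from this step alone, which is why the other expensive stages also need attention.</p>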
<p><span style="font-family: Helvetica, 'Times New Roman', 'Bitstream Charter', Times, serif; color: #1e1e1e;"><span style="line-height: normal;">More on this coming soon&#8230;</span></span></p>
]]></content:encoded>
			<wfw:commentRss>http://www.cabiatl.com/gpu/?feed=rss2&amp;p=23</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SPM B-Spline Interpolation</title>
		<link>http://www.cabiatl.com/gpu/?p=12</link>
		<comments>http://www.cabiatl.com/gpu/?p=12#comments</comments>
		<pubDate>Fri, 16 Oct 2009 13:08:00 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.cabiatl.com/gpu/?p=12</guid>
		<description><![CDATA[Faster reslicing by using a GPU-based mex file.]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.fil.ion.ucl.ac.uk/spm/">SPM</a> is a popular tool for processing brain imaging data. The steps required for image analysis are computationally intensive, and many people are investigating how to make <a href="http://en.wikibooks.org/wiki/SPM/Faster_SPM">SPM faster</a>.</p>
<p>We have used GPUs to accelerate one stage of the SPM pipeline. A common task in many SPM analyses is <a href="http://en.wikipedia.org/wiki/Spline_interpolation">B-spline interpolation</a>, which is used during image reslicing. We have created a MEX file that is written in CUDA and runs on the GPU. Replacing the conventional CPU-based MEX file with our new GPU-based MEX file accelerates this important step.</p>
<p>The left column of the image below shows 2D interpolation, with the rows showing nearest neighbor, linear and a higher-order b-spline. Note that the nearest neighbor appears jagged and the linear appears blurry, whereas the b-spline retains more high-frequency information. The right column shows the filtering kernel for each type of interpolation.</p>
<div style="max-width: 353px; min-width: 5.5em;"><img style="max-width: 353px; max-height: 452px;" title="Logo" src="http://www.cabiatl.com/gpu/wp-content/uploads/2009/10/interpolation.png" border="0" alt="[Image]" /></div>
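<p>The three kernels described above can be written down directly. The sketch below is a plain Python illustration rather than the SPM or CUDA code; it assumes the higher-order B-spline is cubic, which the figure does not state, and the helper kernel_weights is a hypothetical name. Full B-spline interpolation normally also prefilters the image into spline coefficients before applying the basis function; that step is omitted here.</p>

```python
def nearest_kernel(x):
    """Box kernel: weight 1 within half a sample of the centre, else 0."""
    return 1.0 if -0.5 <= x < 0.5 else 0.0


def linear_kernel(x):
    """Triangle kernel used by linear interpolation."""
    return max(0.0, 1.0 - abs(x))


def cubic_bspline_kernel(x):
    """Cubic (third-degree) B-spline basis function."""
    ax = abs(x)
    if ax < 1.0:
        return 2.0 / 3.0 - ax * ax + 0.5 * ax ** 3
    if ax < 2.0:
        return (2.0 - ax) ** 3 / 6.0
    return 0.0


def kernel_weights(kernel, frac, radius=2):
    """Weights a kernel gives to the integer sample offsets around a
    point lying `frac` (0 to 1) past sample 0. Hypothetical helper."""
    return {k: kernel(frac - k) for k in range(-radius + 1, radius + 1)}


for name, kernel in (("nearest", nearest_kernel),
                     ("linear", linear_kernel),
                     ("cubic B-spline", cubic_bspline_kernel)):
    print(name, kernel_weights(kernel, 0.25))
```

<p>Nearest neighbor draws on one sample, linear on two, and the cubic B-spline on four, and in each case the weights sum to one.</p>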
]]></content:encoded>
			<wfw:commentRss>http://www.cabiatl.com/gpu/?feed=rss2&amp;p=12</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tesla hardware arrives</title>
		<link>http://www.cabiatl.com/gpu/?p=1</link>
		<comments>http://www.cabiatl.com/gpu/?p=1#comments</comments>
		<pubDate>Thu, 15 Oct 2009 22:13:07 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.cabiatl.com/gpu/?p=1</guid>
		<description><![CDATA[The CABI has received NVidia Tesla equipment, offering performance historically only seen in supercomputers.]]></description>
			<content:encoded><![CDATA[<p>The Center for Advanced Brain Imaging (CABI) has received two <a href="http://www.nvidia.com/object/tesla_computing_solutions.html">NVidia Tesla S1070</a> high-performance computers. These systems will allow us to develop software that harnesses the power of <a href="http://gpgpu.org/">general-purpose computing on graphics processing units (GPGPU)</a>. The software we develop will run on any modern NVidia graphics processing unit (GPU), including those found in many laptops and desktop computers. However, the Tesla system provides exceptional performance by using many GPUs simultaneously to process brain images.</p>
<div style="max-width: 320px; min-width: 5.5em;">
<img style="max-width: 320px; max-height: 296px;" src="http://www.cabiatl.com/gpu/wp-content/uploads/2009/10/tesla_s1070_prod_shot_2.jpg" border="0" margin="10" title="Logo" alt="[Image]" />
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.cabiatl.com/gpu/?feed=rss2&amp;p=1</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

