A Simple C# Wrapper for Ghostscript
PDF thumbnails with Ghostscript
I’ve been looking for a while now for a simple solution for generating thumbnail images from PDF files. I wanted something that would let me programmatically load in a PDF file, choose a page, and generate a thumbnail from that page. As far as I can tell, there are only a few open source options and of those options I haven’t been able to find one that I could get working with C#.
After seeing it recommended a few times, I decided to take a look at Ghostscript. Ghostscript is an open source interpreter for PostScript and PDF files. Among other things, it lets you generate images from PDF pages, which is exactly what I needed.
Ghostscript is a tool that can be used from the command line, which is how most of the examples I’ve found online have used it. Unfortunately, this is what a call to Ghostscript looks like:
gs -q -dQUIET -dPARANOIDSAFER -dBATCH -dNOPAUSE \
-dNOPROMPT -dMaxBitmap=500000000 -dFirstPage=1 \
-dAlignToPixels=0 -dGridFitTT=0 -sDEVICE=jpeg \
-dTextAlphaBits=4 -dGraphicsAlphaBits=4 -r100x100 \
-sOutputFile=output.jpg input.pdf
Not pretty. Luckily, I needed to automate the task of creating the thumbnails, so I wouldn’t need to manually generate the parameters to be passed to the command line tool. However, I still felt like there might be a better way to hook into Ghostscript’s functionality. So I decided to take advantage of the API provided by Ghostscript by writing a simple C# wrapper around it to use in my current ASP.NET project.
A simple Ghostscript wrapper
The first thing I needed was the Windows version of the Ghostscript DLL, which can be obtained here. Once I included the DLL in my project, I needed to expose the unmanaged API functions to my C# wrapper function.
[DllImport("gsdll32.dll", EntryPoint = "gsapi_new_instance")]
private static extern int CreateAPIInstance(out IntPtr pinstance,
IntPtr caller_handle);
[DllImport("gsdll32.dll", EntryPoint = "gsapi_init_with_args")]
private static extern int InitAPI(IntPtr instance, int argc, IntPtr argv);
[DllImport("gsdll32.dll", EntryPoint = "gsapi_exit")]
private static extern int ExitAPI(IntPtr instance);
[DllImport("gsdll32.dll", EntryPoint = "gsapi_delete_instance")]
private static extern void DeleteAPIInstance(IntPtr instance);
Above, I complained about the long list of parameters that need to be passed to the Ghostscript command line tool. Those same parameters need to be passed to the API, so the next thing I did was create a function that wrapped up the functionality for building the list of parameters. For simplicity, I left in a lot of default parameters, but the function could be expanded later on to allow more specific parameters.
private string[] GetArgs(string inputPath, string outputPath,
int firstPage, int lastPage, int width, int height)
{
return new[]
{
// Keep gs from writing information to standard output
"-q",
"-dQUIET",
"-dPARANOIDSAFER", // Run this command in safe mode
"-dBATCH", // Keep gs from going into interactive mode
"-dNOPAUSE", // Do not prompt and pause for each page
"-dNOPROMPT", // Disable prompts for user interaction
"-dMaxBitmap=500000000", // Set high for better performance
// Set the starting and ending pages
String.Format("-dFirstPage={0}", firstPage),
String.Format("-dLastPage={0}", lastPage),
// Configure the output anti-aliasing, resolution, etc
"-dAlignToPixels=0",
"-dGridFitTT=0",
"-sDEVICE=jpeg",
"-dTextAlphaBits=4",
"-dGraphicsAlphaBits=4",
// Note: -r sets the resolution in DPI, not the pixel size
String.Format("-r{0}x{1}", width, height),
// Set the input and output files
String.Format("-sOutputFile={0}", outputPath),
inputPath
};
}
Once I had a way of creating a list of parameters, I could start using the Ghostscript API functions. I created a function called CallAPI that would accept an array of parameters and use them to call the Ghostscript API.
The function I created for building a list of arguments returned an array of strings, but to use the API I needed to convert each of those parameters into an ANSI null-terminated byte array (I added the code I used to do this to the bottom of this post). Then I needed to pin each of those byte arrays in memory and get a pointer to each one of them.
var argStrHandles = new GCHandle[args.Length];
var argPtrs = new IntPtr[args.Length];
// Create a handle for each of the arguments after
// they've been converted to an ANSI null terminated
// string. Then store the pointers for each of the handles
for (int i = 0; i < args.Length; i++)
{
argStrHandles[i] = GCHandle.Alloc(StringToAnsi(args[i]), GCHandleType.Pinned);
argPtrs[i] = argStrHandles[i].AddrOfPinnedObject();
}
// Get a new handle for the array of argument pointers
var argPtrsHandle = GCHandle.Alloc(argPtrs, GCHandleType.Pinned);
Then, to use the newly converted parameters, I needed to create an instance of the Ghostscript API and pass them into the initialization function.
// Get a pointer to an instance of the Ghostscript API
// and run the API with the current arguments
IntPtr gsInstancePtr;
CreateAPIInstance(out gsInstancePtr, IntPtr.Zero);
InitAPI(gsInstancePtr, args.Length, argPtrsHandle.AddrOfPinnedObject());
The call to InitAPI runs Ghostscript and generates any requested files at the output path.
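The snippet above ignores the integer that each API call returns, which makes failures hard to diagnose. Here is a minimal sketch of a return-code check. The class and method names are hypothetical, and the two constants are my reading of the Ghostscript API documentation (zero or positive means success, and the "quit" and "info" codes are documented as not being errors), so verify them against the headers of the version you ship.

```csharp
using System;

public static class GhostscriptStatus
{
    // Assumed values; check them against your Ghostscript version.
    public const int Quit = -101; // "quit" was executed; not an error
    public const int Info = -110; // "gs -h" was executed; not an error

    // True when a code returned by gsapi_init_with_args means a normal finish.
    public static bool IsSuccess(int code)
    {
        return code >= 0 || code == Quit || code == Info;
    }

    // Turns a failing return code into an exception that names the step.
    public static void ThrowIfFailed(int code, string step)
    {
        if (!IsSuccess(code))
        {
            throw new InvalidOperationException(
                string.Format("Ghostscript {0} failed with code {1}.", step, code));
        }
    }
}
```

With this in place, the InitAPI call could be wrapped as `GhostscriptStatus.ThrowIfFailed(InitAPI(...), "gsapi_init_with_args")`, so a bad argument list surfaces as an exception instead of a missing output file.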
Now the only remaining thing I needed to do was clean up the memory that was allocated for the API. To handle this, I wrote a cleanup function that takes in the items that need to be cleaned up. The API provides some cleanup functions, so I called those in the cleanup function as well.
private void Cleanup(GCHandle[] argStrHandles, GCHandle argPtrsHandle,
IntPtr gsInstancePtr)
{
for (int i = 0; i < argStrHandles.Length; i++)
argStrHandles[i].Free();
argPtrsHandle.Free();
ExitAPI(gsInstancePtr);
DeleteAPIInstance(gsInstancePtr);
}
One last thing I added to the wrapper was a simple function for generating thumbnails from a source PDF file. Technically, I could have just used the CallAPI function to do that, but I wanted to hide the details of working with the API from code outside of the wrapper class.
public void GeneratePageThumbs(string inputPath, string outputPath,
int firstPage, int lastPage, int width, int height)
{
CallAPI(GetArgs(inputPath, outputPath, firstPage, lastPage, width, height));
}
GeneratePageThumbs doesn’t do anything other than call the CallAPI function. However, in the future, I’d like to provide other functions that use the Ghostscript API as well. If anyone has any ideas for improving the code, drop me a line.
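The post shows the pieces of CallAPI but never the function as a whole. Below is a rough sketch of how the pinning and cleanup steps could be combined so that the handles are freed even if the native call throws. `PinnedArgs`, `Run` and the `NativeCall` delegate are hypothetical names; the delegate stands in for the code that talks to Ghostscript, which keeps the sketch runnable without the DLL. It also assumes the arguments are plain ASCII.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

public static class PinnedArgs
{
    // Hypothetical delegate: stands in for the code that calls the Ghostscript API.
    public delegate int NativeCall(int argc, IntPtr argv);

    public static int Run(string[] args, NativeCall call)
    {
        var argStrHandles = new GCHandle[args.Length];
        var argPtrs = new IntPtr[args.Length];
        var argPtrsHandle = default(GCHandle);
        try
        {
            for (int i = 0; i < args.Length; i++)
            {
                // Appending "\0" supplies the null terminator that C expects
                byte[] bytes = Encoding.ASCII.GetBytes(args[i] + "\0");
                argStrHandles[i] = GCHandle.Alloc(bytes, GCHandleType.Pinned);
                argPtrs[i] = argStrHandles[i].AddrOfPinnedObject();
            }
            argPtrsHandle = GCHandle.Alloc(argPtrs, GCHandleType.Pinned);
            return call(args.Length, argPtrsHandle.AddrOfPinnedObject());
        }
        finally
        {
            // Runs whether the call returned or threw
            for (int i = 0; i < argStrHandles.Length; i++)
            {
                if (argStrHandles[i].IsAllocated) argStrHandles[i].Free();
            }
            if (argPtrsHandle.IsAllocated) argPtrsHandle.Free();
        }
    }
}
```

In the wrapper, the delegate body would call CreateAPIInstance, InitAPI, ExitAPI and DeleteAPIInstance in that order, returning the code from InitAPI.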
Update: Here is the code I used to convert the arguments to null-terminated byte arrays. There might be a better way to do this in .NET; this is just the quick solution I’m using.
public static byte[] StringToAnsi(string original)
{
var strBytes = new byte[original.Length + 1];
for (int i = 0; i < original.Length; i++)
strBytes[i] = (byte)original[i];
strBytes[original.Length] = 0;
return strBytes;
}
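One caveat with the cast in StringToAnsi: `(byte)original[i]` keeps only the low byte of each character, so anything outside Latin-1 (in a file path, for example) is silently mangled. A possible alternative is to let a .NET `Encoding` do the conversion. This is only a sketch: the class and method names are made up for illustration, and which encoding Ghostscript expects for file names depends on the build and platform, so the encoding is left as a parameter.

```csharp
using System;
using System.Text;

public static class AnsiStrings
{
    // Encodes the string and leaves one trailing zero byte as the terminator.
    public static byte[] ToNullTerminated(string original, Encoding encoding)
    {
        // A new byte[] is zero-filled, so the extra final byte is already 0
        var bytes = new byte[encoding.GetByteCount(original) + 1];
        encoding.GetBytes(original, 0, original.Length, bytes, 0);
        return bytes;
    }
}
```

For example, `AnsiStrings.ToNullTerminated("-q", Encoding.ASCII)` yields the bytes 45, 113, 0.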
February 12th, 2009 at 7:31 pm
Hi there,
Nice piece of work here; it is exactly the same thing I am working on. Would it be possible to make the source code available for download somewhere, as some parts have not been outlined (StringToAnsi)?
Cheers,
Brendon
February 13th, 2009 at 10:53 am
I added the code for StringToAnsi to the bottom of the post.
When I have time, I’m probably going to spin this code off into an open source project, since I’ve gotten a lot of questions about it.
February 21st, 2009 at 10:57 am
Hi,
About your .NET Ghostscript wrapper: have you created the callback functions to show the file in a PictureBox?
Thanks
February 27th, 2009 at 6:01 pm
No, I haven’t created any callback functions for that. Pretty much what you see here is what I’ve written. It’s only focused on generating the PDF page preview.
That would be interesting functionality, I just didn’t need it for what I was doing.
March 3rd, 2009 at 5:05 am
If you want thumbnails from PDF files, you have Cairographics PDF backend as working C# solution. Although I’ve managed to make it work only with Mono, not with Visual Studio.
March 5th, 2009 at 10:10 am
Nice piece of work here, but I have some problems. Can you send me a complete sample?
thanks.
March 7th, 2009 at 11:36 am
Hi,
Great job with the wrapper. Just what I needed.
I just wanna point out a couple of things that I ran into:
You need either the full package of GhostScript installed on the client computer or you can unzip all the files from the lib and Resource folders in the gs863w64.exe file into the same folder as gsdll32.dll (all files in the same folder - moved from the subfolders).
Also the resolution you specify with int width, int height is in DPI not pixels so depending on output it’s probably easiest to choose a resolution between 36×36 and 288×288 and use GDI to resize the image to what you really need.
March 9th, 2009 at 5:14 pm
You’re right about the height and width. I actually ended up starting with a higher resolution image and then used GDI to resize the larger image like you said. I’m hoping to include the code if and when I put together a little open source project for this.
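Since -r takes DPI, the resolution needed for a given thumbnail width has to be worked out from the page size. A small sketch of that arithmetic follows, assuming you already know the page width in points (612 for US Letter). The class and method names are hypothetical, and the result is rounded up so the rendered image is never narrower than the target before the final resize.

```csharp
using System;

public static class ThumbnailDpi
{
    // PDF page sizes are measured in points; 72 points = 1 inch
    private const double PointsPerInch = 72.0;

    // Smallest whole DPI that renders the page at least targetPixelWidth wide.
    public static int ForPixelWidth(double pageWidthPoints, int targetPixelWidth)
    {
        double dpi = targetPixelWidth * PointsPerInch / pageWidthPoints;
        return (int)Math.Ceiling(dpi);
    }
}
```

For a US Letter page and a 150 pixel wide thumbnail this gives 18, which would be passed as `-r18x18`.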
March 17th, 2009 at 3:44 pm
The code works great from a command prompt, thank you! Unfortunately, when I use it within a web service (IIS, ASP.NET MVC) I get a -100 from gsapi_init_with_args. Am I missing a parameter somewhere to get this to run as part of a service?
March 20th, 2009 at 9:40 am
Hi there, good article! When do you reckon the code will be available?
April 8th, 2009 at 9:39 am
Matthew, I am working on a project, part of which requires converting PostScript to a PDF file within C#. Your code and the comments from others look promising for me to use. We would appreciate it if you could provide us with the source code for the wrapper.
April 13th, 2009 at 11:58 pm
@Nanda Motikane The source code on here is pretty much all there is to it, aside from some application-specific stuff that I added for what I was working on.
April 29th, 2009 at 3:02 pm
Thanks for the terrific script Matthew. I have it working on an ASP.NET page, Framework version 2.0.
A couple of notes on changes I made to get it working:
- I needed to put gsdll32.dll in my system32, or alternatively, load it with LoadLibrary.
- I needed to change the line return new[] to actually declare a temp string[] and return it explicitly at the bottom of the function.
- I added a call to Cleanup() at the bottom of your CallAPI(), because you must have called it somewhere outside of your code snippets.
- I replaced your instances of “var” to appropriate type names. “var” must have represented something specific in your system that’s not native to .NET I guess.
Thanks again for showing us how to implement something that should be truly simple but is, in fact, disgustingly difficult and frustrating!
May 13th, 2009 at 11:47 am
I have another tip for refining the thumbnail creator further: use ghostscript’s -dUseCropBox option. This will create a proper thumbnail image for pdfs whose cropbox is set differently than the mediabox. I’ve converted about 20,000 pdfs so far - and while there aren’t many that do this, there are some.
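As a sketch of how that switch could be added without rewriting GetArgs, the helper below inserts `-dUseCropBox` just before the input file. The class and method names are hypothetical, and it assumes the input path is the last element of the array, as it is in the GetArgs function from the post.

```csharp
using System;
using System.Collections.Generic;

public static class CropBoxArgs
{
    // Returns a copy of args with -dUseCropBox placed before the input file.
    public static string[] WithCropBox(string[] args)
    {
        var result = new List<string>(args);
        // Assumes the input file is the last argument, as in GetArgs
        int inputIndex = Math.Max(result.Count - 1, 0);
        result.Insert(inputIndex, "-dUseCropBox");
        return result.ToArray();
    }
}
```

It would be used as `CallAPI(CropBoxArgs.WithCropBox(GetArgs(...)))`.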
May 14th, 2009 at 8:24 am
Your DllImport statements could really be simplified to the following:
[DllImport("gsdll32.dll", EntryPoint = "gsapi_new_instance")]
private static extern int CreateAPIInstance(out IntPtr pinstance,
IntPtr caller_handle);
[DllImport("gsdll32.dll", EntryPoint = "gsapi_init_with_args")]
private static extern int InitAPI(IntPtr instance, int argc, string[] argv);
[DllImport("gsdll32.dll", EntryPoint = "gsapi_exit")]
private static extern int ExitAPI(IntPtr instance);
[DllImport("gsdll32.dll", EntryPoint = "gsapi_delete_instance")]
private static extern void DeleteAPIInstance(IntPtr instance);
This would remove the need for some of your more complicated code (including StringToAnsi).
Thanks for all your hard work!
May 19th, 2009 at 6:00 am
Hi,
Can you help me detect the colored pages in a PDF file using this wrapper?
thanks
May 20th, 2009 at 4:44 pm
Thank you to everyone who has posted improvements to this code. Rather than continually updating this blog post, I’m planning on putting together an open source project some time in the near future that will wrap up this code. That way, anyone who has improvements can work on the code.
June 9th, 2009 at 3:59 pm
I’d like to also say thanks for this solution. I too simply needed something to generate thumbs from PDFs, and this works perfect. You rock!