Soares $90 CD Document Dump (Updated)


Albany County Clerk Thomas Clingan just released a CD containing 8,562 documents from District Attorney David Soares’ investigation of Troopergate. The CD costs the public and the press $90.

I’m just starting to go through it, and it’s an absolute mess. Every page is its own pdf file, and the files are not scrollable or searchable, which makes them that much more difficult to review. None of the files are titled, and there’s no index.

Soares was under orders from the state to release testimony he received from former Gov. Eliot Spitzer and his aides, after Soares fought the documents’ release.

Updated: Soares’ aides said there was no order to release the documents, just an advisory opinion from the state Committee on Open Government.

So the county clerk hired a company to scan all the files (which are in the boxes pictured above) and put everything on a disc (pictured at left), then charged the media for the expense.

Soares conducted two investigations into whether Spitzer and his aides conspired to release travel documents on former Senate Majority Leader Joseph Bruno’s use of state aircraft.

Soares found no criminal wrongdoing in either probe. Last month, the state Commission on Public Integrity charged four Spitzer aides with civil violations of the state’s Public Officers Law.




  1. “Every page is it’s own pdf file…”
    or “Every page is its own pdf file…”

    Also, any clause (like the one after LEAVE A REPLY) before a colon should be a complete sentence (with a subject, a verb, and an object). You could just take that colon out or you could add language to make that clause complete.

  2. Actually, you can search across all the files using keyword search in the Windows Explorer search feature, or you can combine all of the pdf files into one file and use Adobe Acrobat search, then bookmark key items in the file.

    If they had added indexing and organization to the CD, you would have paid far more than $90.

    To create an index, you have to analyse the documents and identify the metadata or properties that you want to capture. You have to identify document types, such as a letter, a contract, etc. Once you do that, you have to pay someone to do data entry, and then you need someone to do QC on the data entry.

    The county clerk did an excellent job cost-wise, since they basically paid 1 cent per document to have them scanned. Usually you are charged per page, and a document doesn’t necessarily equate to one page. An additional factor that must be taken into account is preparing the documents for scanning. This means removing staples, paperclips, binder clips, in other words any sort of paper fastener. Then you have to flatten each piece of paper, arrange the pages, and insert them into the scanner. I’ll assume that they used a high-speed scanner, which helped to hold down the cost.

    So if you want a searchable, organized set of documents, be prepared to pay a much, much higher price. You get what you pay for.
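
For anyone who wants to check the "1 cent per document" figure in the comment above, here is a quick sketch of the arithmetic. It uses only the two numbers given in the post ($90 and 8,562 documents) and assumes the full $90 is spread evenly across every document on the disc; the variable names are just for illustration.

```python
# Figures taken from the post: a $90 CD holding 8,562 documents.
cd_price_dollars = 90.00
document_count = 8562

# Assumption: the whole price is spread evenly across all documents.
cost_per_document = cd_price_dollars / document_count
cost_in_cents = round(cost_per_document * 100, 2)

print(f"Cost per document: ${cost_per_document:.4f} (about {cost_in_cents} cents)")
# prints: Cost per document: $0.0105 (about 1.05 cents)
```

So the commenter's rounding to one cent holds up, at least for the price charged per disc.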