Batch .Doc to .PDF conversion - HotUKDeals

allsa001
Posted 6 years, 1 month ago
Very techy here - I do apologise

Hi guys,
I have a file structure that contains lots of subfolders, and in these subfolders I have various Word documents.

What I am after is a way/programme to convert these Word documents to PDF whilst still retaining my original file structure.

I have had a few ideas, but they mainly fail at retaining the original file structure. Any ideas would be very much appreciated.

All Comments (28)
#1
Hi,

Do you wish to retain the original ".doc" files?
Do you already have a Portable Document Format conversion tool, or are you looking for suggestions for one too (one that may already have a batch conversion feature)?

Presumably you have Microsoft Word installed on your PC too. Is that the case?

BFN,

fp.
#2
Hi FP,
Thanks for the quick response. I do have Word installed on my machine, as well as Adobe Writer.
I don't mind retaining the original ".doc" files; once I was certain it had worked, I could search for these files and delete them afterwards.

Usually, when I have opened the document in Word, I print to an installed printer that converts to PDF; you just need to specify the location. My initial idea was to set this printer as my default; I could also set the port of this printer to a specified output folder. Then I would Ctrl+F all *.doc files (there are 3888 of them), select all and print. I would have to press the Enter key 3888 times to print to the above-mentioned directory (a prompt pops up each time), but that would not be a problem at all.

The problem with the above method is that it will put all the PDFs into one folder. I really need to retain the original file structure though, so this idea kind of fails.
#3
Hi,

Does Adobe Writer have an interface you can use from Visual Basic for Applications [VBA] that you know of?

I was thinking that you (&/or I) could write some code that would trawl through your folders, convert the ".doc" file(s) found to a corresponding ".pdf" & place the result in the same folder (for you to then review & remove the original “.doc” file later).

Failing the existence of a bespoke Application Programming Interface [API], each of the files could just be opened, printed to the designated Windows Printer that redirects to the Adobe Writer software, & the output “.pdf” file could then be moved to the appropriate folder (where the ".doc" resides).

I do not have access to Adobe Writer, unfortunately, but could you start recording a Macro in Word, open one of your ".doc" files, print it to the appropriate installed Printer, & then stop recording? If you take a look at the resultant Visual Basic for Applications code generated (& post it in a comment) it may help (you/I/us) automate the process.

BFN,

fp.
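
A rough sketch of the first approach described above (walk the folders, convert each ".doc", and write the ".pdf" next to it) might look like the following. It is an illustration only and has not been run against this folder structure. It assumes a version of Word in which Document.ExportAsFixedFormat is available (Word 2007 onwards; as far as I know, Word 2007 needs the Save as PDF add-in or a later service pack), so no PDF printer is involved. The procedure names ConvertFolderTree and ConvertFolder, and the root path, are made up for the example.

```vba
Option Explicit

' Hypothetical entry point: the root path is an assumption, change it to suit.
Sub ConvertFolderTree()
    Dim wordApp As Object
    Dim fso As Object

    Set wordApp = CreateObject("Word.Application")
    Set fso = CreateObject("Scripting.FileSystemObject")

    ConvertFolder wordApp, fso.GetFolder("C:\AA Word Files")

    wordApp.Quit
    Set wordApp = Nothing
End Sub

' Converts every ".doc" in one folder, then repeats for each subfolder.
Private Sub ConvertFolder(ByVal wordApp As Object, ByVal folder As Object)
    Const wdExportFormatPDF As Long = 17   ' Word constant, spelled out because Word is late bound
    Dim file As Object
    Dim subFolder As Object
    Dim doc As Object
    Dim pdfPath As String

    For Each file In folder.Files
        ' Skip Word's temporary "~" files and anything that is not a ".doc".
        If LCase$(Right$(file.Name, 4)) = ".doc" And Left$(file.Name, 1) <> "~" Then
            ' Same folder, same base name, ".pdf" extension.
            pdfPath = Left$(file.Path, Len(file.Path) - 4) & ".pdf"
            Set doc = wordApp.Documents.Open(FileName:=file.Path, ReadOnly:=True)
            doc.ExportAsFixedFormat OutputFileName:=pdfPath, ExportFormat:=wdExportFormatPDF
            doc.Close SaveChanges:=False
        End If
    Next file

    For Each subFolder In folder.SubFolders
        ConvertFolder wordApp, subFolder
    Next subFolder
End Sub
```

Because each PDF is written beside its source document, identical filenames in different subfolders do not collide, and the original ".doc" files are left in place for review.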
#4
I have some VBA that I can send you in the morning that will do this for you, as I do the same at work.
#5
Thanks for all of your support guys, it's all very much appreciated, and should save me a hell of a lot of time.

fp, the only thing that I can see that may be a problem is that a lot of the files will have the same filename. It's usually their whole filepath that specifies what it is. Very bad file naming, I know!

Will that have bad implications for your idea?

coddfather, thanks for your assistance
#6
Hi, a little bit of an update guys,
although I have not got any further at all.

I am using a separate piece of software that searches through all my subfolders for any Word documents, and converts them to PDF files in a separate output folder. It does keep the original file name, just changing the *.doc to *.pdf.

Therefore I have 2 different directories:
Directory 1 - contains all my neatly organised subfolders, which contain 3888 Word documents between them

Directory 2 - contains 8333 pdf documents, all in its root directory. These PDFs share the filenames of their corresponding Word documents, which are neatly organised within subfolders in Directory 1.

Can you think of any VB (?) that will run in a loop to go through each PDF in Directory 2, and paste this file into the subfolder in Directory 1 that contains the Word document with the same filename?

Ignore my earlier post about files having the same name; I will amend this by hand to ensure that each file has a unique filename.

I hope this makes sense. If this is possible, I would hugely appreciate all of your help.
#7
Do you wish to wait for coddfather's offer of some suitable code to address your issue before discussing further (in case no further input is required)?

Do you have the source code for the software you mentioned that converts the files to the output folder?

If so, then we can amend this to either store the full path & filename of the source file & then write a similar routine to reinstate (copy/move) the resultant files back to their respective original location, or simply change the code to not store the output in the single folder but place the output in the same folder as the source document.

If coddfather's code also converts the Microsoft Word ".doc" document files to Adobe Portable Document Format ".pdf" files, then you may well have a working solution tomorrow (morning).

However, if the conversion is not performed, then we can look at other methods to achieve your desired result.

Either way, I can help with the coding of whatever method you determine is the way forward for your needs.

Do you have a definitive deadline for the 'project'?

BFN,

fp.
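
The idea above of recording where each source file lives, and then putting the converted files back, could be sketched roughly as below. This is only an illustration under assumptions: the two folder paths are the ones used as examples in this thread, the procedure names BuildIndex and MovePdfsBack are invented, and a Scripting.Dictionary (late bound) stands in for whatever storage the real routine would use. A filename that occurs in more than one subfolder is reported and skipped rather than guessed at.

```vba
Option Explicit

' Hypothetical helper: maps each ".doc" base name to the folder that holds it.
' A name seen in more than one folder is stored with an empty folder so that
' the matching PDF is left alone, and the clash is printed for review.
Private Sub BuildIndex(ByVal folder As Object, ByVal index As Object)
    Dim file As Object, subFolder As Object, baseName As String

    For Each file In folder.Files
        If LCase$(Right$(file.Name, 4)) = ".doc" Then
            baseName = LCase$(Left$(file.Name, Len(file.Name) - 4))
            If index.Exists(baseName) Then
                index.Item(baseName) = vbNullString
                Debug.Print "Duplicate name, skipped: " & file.Path
            Else
                index.Add baseName, folder.Path
            End If
        End If
    Next file

    For Each subFolder In folder.SubFolders
        BuildIndex subFolder, index
    Next subFolder
End Sub

Sub MovePdfsBack()
    Dim fso As Object, index As Object, pdf As Object
    Dim pdfPaths As New Collection, item As Variant
    Dim baseName As String, targetFolder As String

    Set fso = CreateObject("Scripting.FileSystemObject")
    Set index = CreateObject("Scripting.Dictionary")
    BuildIndex fso.GetFolder("C:\AA Word Files"), index

    ' List the PDFs first so the folder is not changed while it is being read.
    For Each pdf In fso.GetFolder("C:\AA PDF Files").Files
        If LCase$(Right$(pdf.Name, 4)) = ".pdf" Then pdfPaths.Add pdf.Path
    Next pdf

    For Each item In pdfPaths
        baseName = LCase$(fso.GetBaseName(CStr(item)))
        If index.Exists(baseName) Then
            targetFolder = index.Item(baseName)
            If Len(targetFolder) > 0 Then
                ' Name ... As moves the file when the folder part differs.
                Name CStr(item) As fso.BuildPath(targetFolder, fso.GetFileName(CStr(item)))
            End If
        End If
    Next item
End Sub
```

Any PDF without a uniquely named Word document stays in the output folder, so whatever is left there afterwards is the list of files to sort by hand.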
#8
Fanpages, I cannot stress what a great help you're being.
I think it best to see what happens tomorrow, and then to see what the best way forward is.

I really didn't think I was going to have much luck with this, but with both of your help I think this is now going to be possible.

In terms of deadline, it's by no means urgent. The sooner the better of course, but as long as I stop saving in Word from now on and start converting to PDF, it will just mean that I have a backlog of 3888 documents to sort through. So to have this automatically completed will be a massive time saver. Plus I really enjoy learning new skills. I really wouldn't mind doing this task manually, but to learn how to do it a better way really makes my day.

Thanks again guys.
#9
Hopefully this should do what you need. The original also combined a Word doc and an Excel spreadsheet into a single PDF (I have hopefully commented out all the Excel parts), and then it used to place the original docs in a zip file and delete them.

You will need to change the file paths. This also uses a free program called PDFCreator, which is just easier for us, especially if we need to distribute it, as not everyone has Adobe Writer.


Option Compare Database
Option Explicit
Public vitem, vFile_Name, vTime_Stamp, sWinZip, sZipFile, sFileToZip As String, i(1 To 2) As Long
Function Convert_To_PDF_Excel_Word()
'Call Process_Progress("Converting Final TMAs to PDF.", "", dTime(1))
i(2) = 0
'Call Constants
Try_Again:
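'Kill_Processes, Process_Progress, strLocal_Path and dTime are not defined in this post; supply them or remove the references
Call Kill_Processes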
Dim WordObj As Object
Set WordObj = CreateObject("Word.Application")
Dim XLObj As Object
Set XLObj = CreateObject("Excel.Application")

Dim strDefault_Printer As String
strDefault_Printer = WordObj.ActivePrinter
With Application.FileSearch
.Filename = "*.doc"
.LookIn = strLocal_Path & "Outputs\TMA_Draft_Documents"
.SearchSubFolders = False
.Execute
i(1) = .FoundFiles.Count
For Each vitem In .FoundFiles

vFile_Name = Mid(vitem, InStrRev(vitem, "\") + 1, Len(vitem) - (InStrRev(vitem, "\") + 4))

If (Left(vFile_Name, 1) = "~") Then

SetAttr vitem, vbNormal
Kill vitem

WordObj.Quit
Set WordObj = Nothing
XLObj.Quit
Set XLObj = Nothing
GoTo Try_Again

End If

'If (Left(Right(vFile_Name, Len(vFile_Name) - InStrRev(vFile_Name, "_")), 1) = "v") Then

'If IsNumeric(Right(Right(vFile_Name, Len(vFile_Name) - InStrRev(vFile_Name, "_")), Len(Right(vFile_Name, Len(vFile_Name) - InStrRev(vFile_Name, "_"))) - 1)) Then

Dim pdfjob As PDFCreator.clsPDFCreator
Set pdfjob = New PDFCreator.clsPDFCreator

WordObj.Visible = True
WordObj.Documents.Open strLocal_Path & "Outputs\TMA_Draft_Documents\" & vFile_Name & ".doc"

'XLObj.Visible = True
'XLObj.Workbooks.Open strLocal_Path & "Outputs\TMA_Draft_Documents\" & vFile_Name & '"_Summary.xls"

If pdfjob.cStart("/NoProcessingAtStartup") = False Then

GoTo Try_Again

End If

'Call Process_Pause

With pdfjob

'Set Values
.cOption("UseAutosave") = 1

If (.cOption("UseAutosave") = "Empty") Then

MsgBox "Error", vbOKOnly, "Error"

End If

.cOption("UseAutosaveDirectory") = 1
.cOption("AutosaveDirectory") = strLocal_Path & "Outputs\TMA_Draft_Documents"
.cOption("AutosaveFilename") = vFile_Name
.cOption("AutosaveFormat") = 0

'The following are required to set security of any kind
.cOption("PDFUseSecurity") = 1
.cOption("PDFOwnerPass") = 1
.cOption("PDFOwnerPasswordString") = "irotma"

'To set individual security options
.cOption("PDFDisallowCopy") = 1
.cOption("PDFDisallowModifyContents") = 1
.cOption("PDFDisallowPrinting") = 0

.cVisible = False

.cClearCache

'Save Values
.cSaveOptions
.cPrinterStop = True
End With

WordObj.ActivePrinter = "PDFCreator"
WordObj.ActiveDocument.PrintOut , Copies:=1

'Call Process_Pause

Do Until pdfjob.cCountOfPrintjobs = 1
DoEvents
Loop

'XLObj.ActiveSheet.PrintOut Copies:=1, ActivePrinter:="PDFCreator on Ne00:"

'Call Process_Pause

'Wait until the print job has entered the print queue
'Do Until pdfjob.cCountOfPrintjobs = 2
'DoEvents
'Loop

'Combine all PDFs into a single file and stop the printer
'With pdfjob
'.cCombineAll
'.cPrinterStop = False
'End With

'Wait until PDF creator is finished then release the objects
Do Until pdfjob.cCountOfPrintjobs = 0
DoEvents
Loop

'Wait until the PDF file shows up then release the objects
Do Until Dir(strLocal_Path & "Outputs\TMA_Draft_Documents\" & vFile_Name & ".pdf") <> ""
DoEvents
Loop

'Wait a bit longer for PDF Creator to finish
Call Process_Pause

If (i(2) = i(1) - 1) Then

WordObj.ActivePrinter = strDefault_Printer
'XLObj.ActivePrinter = strDefault_Printer
End If

WordObj.ActiveDocument.Close savechanges:=False

'XLObj.ActiveWorkbook.Close savechanges:=False

pdfjob.cClose
Call Process_Pause
Set pdfjob = Nothing

Continue:

'ZIP FILE
'sWinZip = "C:\Program Files\WinZip\WinZip32.exe" 'Location of the WinZip program
'sFileToZip = (strLocal_Path & "Outputs\TMA_Draft_Documents\" & vFile_Name & "*")
'sZipFile = (strLocal_Path & "Outputs\TMA_Draft_Documents\" & vFile_Name & ".zip")
'sWinZip = "C:\Program Files\7-Zip\7z.exe" 'Location of the WinZip program
'sFileToZip = (strLocal_Path & "Outputs\TMA_Draft_Documents\" & vFile_Name & "*")
sZipFile = (strLocal_Path & "Outputs\TMA_Draft_Documents\" & vFile_Name & ".7z")

'SetAttr sZipFile, vbNormal
'ShellWait sWinZip & " -a " & sZipFile & " " & sFileToZip, vbHide



'DELETE FILE
'Kill strLocal_Path & "Outputs\TMA_Draft_Documents\" & vFile_Name & ".doc"
'Kill strLocal_Path & "Outputs\TMA_Draft_Documents\" & vFile_Name & "*.xls"
'End If
'End If
i(2) = i(2) + 1
Call Process_Progress("Converting Final TMAs to PDF.", Round((100 / i(1)) * i(2), 2) & "% Complete.", dTime(1))
Next vitem

End With

WordObj.Quit
Set WordObj = Nothing
'XLObj.Quit
'Set XLObj = Nothing
'Call Kill_Processes
Call PDF_RestoreDefaults
End Function
Function Process_Pause()
Dim i(1 To 2)
i(1) = Now()
i(2) = DateAdd("s", 2, i(1))
Do Until i(1) >= i(2)
i(1) = Now()
Loop
End Function
Function PDF_RestoreDefaults()
Dim objPDF As New PDFCreator.clsPDFCreator
With objPDF
If .cStart("/NoProcessingAtStartup") = False Then
MsgBox "Can't initialize PDFCreator.", vbCritical + _
vbOKOnly, "PrtPDFCreator"
Exit Function
End If

'Set Values
.cOption("UseAutosave") = 0
.cOption("UseAutosaveDirectory") = 1
.cOption("AutosaveDirectory") = "\"
.cOption("AutosaveFilename") = ""
.cOption("AutosaveFormat") = 0
.cOption("UseCreationdate") = vbNullString
.cOption("UseStandardAuthor") = 0
.cOption("PDFUseSecurity") = 0
.cOption("PDFUserPass") = 0
.cOption("PDFUserPassString") = vbNullString
.cOption("PDFOwnerPass") = 1
.cOption("PDFOwnerPassString") = vbNullString
.cOption("PDFEncryptor") = 0
.cOption("PDFDisallowCopy") = 1
.cOption("PDFDisallowPrinting") = 0
.cOption("PDFDisallowModifyContents") = 0
.cOption("PDFDisallowModifyAnnotations") = 0
.cOption("PrinterTempPath") = Environ("Temp") & "\PDFCreator\"

'Save Values
.cSaveOptions
End With
Set objPDF = Nothing
End Function
#10
Or something like this will move the PDF to the corresponding folder of the Doc

With Application.FileSearch
.Filename = "*.doc"
.LookIn = 'Enter Root of word docs'
.SearchSubFolders = true
.Execute
i(1) = .FoundFiles.Count
For Each vitem In .FoundFiles
vFile_Name = Mid(vitem, InStrRev(vitem, "\") + 1, Len(vitem) - (InStrRev(vitem, "\") + 4))

name 'Path of PDF' & "\" & vFile_Name & ".pdf" As Left(vitem,InStrRev(vitem, "\") + 1) & vFile_Name & ".pdf"
Next vitem
#11
Hi Coddfather & FP.
Wow, that looks like a lot of code, and unfortunately my knowledge of code isn't great to say the least.

I have copied the second piece of code you posted into a Notepad file, and renamed it as a bat file.
I inserted the folder names in "" 's, and ran it, but it does not seem to function. I don't get any error messages, just a black box that flashes up, which makes me think it runs OK. After it has run, I check the folders, and they contain exactly the same files as before I ran the bat file. The bat file reads:

With Application.FileSearch
.Filename = "*.doc"
.LookIn = "C:\AA Word Files"
.SearchSubFolders = true
.Execute
i(1) = .FoundFiles.Count
For Each vitem In .FoundFiles
vFile_Name = Mid(vitem, InStrRev(vitem, "\") + 1, Len(vitem) - (InStrRev(vitem, "\") + 4))

name "C:\AA PDF Files" & "\" & vFile_Name & ".pdf" As Left(vitem,InStrRev(vitem, "\") + 1) & vFile_Name & ".pdf"
Next vitem


It refers to 2 directories:
C:\AA Word Files - where my word files are saved (within their sub folders)
C:\AA PDF Files - where my PDF files are saved (without subfolders)

Each Word file has a PDF with the same filename, just with a different file extension.

Any ideas on what I am doing wrong at all?
#12
Thanks again for all of your help.
#13
It's VBA code, so you need to add it as a function within VBA (in Word, Excel, etc.).

You also need to declare the variables. It should start to make sense. I'm on my phone at the moment; if you are stuck I will reply when I get home.

Edited By: coddfather on Oct 06, 2010 20:11: edited
#14
Sorry coddfather, as I said I am not at all experienced in VBA. I have left a few messages with friends to see whether they can help too.

I am using Office 2007 at present. I had a quick google, and saw some articles that say that Application.FileSearch is not supported in this version of Excel. Shall I install my Office 2000 as well?

Any further help would be gratefully received.
#15
Do you have MS Access? If so, let me know and I will check the code so you can cut and paste it straight in.
#16
Thanks Coddfather. I have MS Access
#17
Hi,

Paste this straight into a blank module then, and use F8 to step through and test.

Option Compare Database
Option Explicit

Function Move_PDF()

Dim vitem As Variant
Dim vFile_Name As String

With Application.FileSearch
.Filename = "*.doc"
.LookIn = "C:\AA Word Files"
.SearchSubFolders = True
.Execute
For Each vitem In .FoundFiles

vFile_Name = Mid(vitem, InStrRev(vitem, "\") + 1, Len(vitem) - (InStrRev(vitem, "\") + 4))

Name "C:\AA PDF Files" & "\" & vFile_Name & ".pdf" As Left(vitem, InStrRev(vitem, "\") + 1) & vFile_Name & ".pdf"

Next vitem
End With

End Function
#18
Awesome.
It seems to be getting stuck at:

With Application.FileSearch


I don't know whether this is an Office 2007 problem, so I am installing Office 2000 to see whether it fixes this. I'm quite excited, as I think you may have cracked it!
#19
I will let you both know how I get on once Office 2000 has finished installing
#20
I think you are missing a reference. In the module window, under References, tick Microsoft DAO 3.6 and try that. If that doesn't work, just do a quick google to find out what reference you need to have ticked.
#21
So sorry about this, I tried your suggestion but it still kept halting at that point.
I have installed Office 2000, and it no longer gets stuck at that point, however the PDF files are still not moving.

I'm pretty sure that I am doing something wrong, so will hopefully be able to ask a friend at work whether he can help, with the code that you provided. I expect it's something quite simple.

I will update you both after I get back from work tomorrow. I really think that it's just a matter of dotting the i's now though.

Thanks for all of your time
#22
My mistake, change the name line to

Name "C:\AA PDF Files" & "\" & vFile_Name & ".pdf" As Left(vitem, InStrRev(vitem, "\")) & vFile_Name & ".pdf"

and that will work, just tested myself and worked
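
To see why the "+ 1" mattered, the small sketch below prints the pieces that the string functions return for one path. The path and the procedure name ShowPathParts are made up for the illustration; the expressions are the same ones used in the code in this thread.

```vba
Option Explicit

Sub ShowPathParts()
    Dim fullPath As String
    Dim slashPos As Long

    fullPath = "C:\AA Word Files\Sub\Report.doc"   ' hypothetical example path
    slashPos = InStrRev(fullPath, "\")             ' position of the last backslash

    ' Folder including its trailing backslash: C:\AA Word Files\Sub\
    Debug.Print Left$(fullPath, slashPos)

    ' One character too many, the first letter of the name comes along: C:\AA Word Files\Sub\R
    Debug.Print Left$(fullPath, slashPos + 1)

    ' File name without the 4-character ".doc" extension: Report
    Debug.Print Mid$(fullPath, slashPos + 1, Len(fullPath) - (slashPos + 4))
End Sub
```

With the extra character, the target built by the earlier Name line for this example would have been "C:\AA Word Files\Sub\RReport.pdf" rather than "C:\AA Word Files\Sub\Report.pdf".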
#23
And MS kindly removed this function in Office 2007, so the workaround was detailed here:
935402
I hope we don't move to the new version; we would have a lot of work to do, as I bet this isn't the only thing.
#24
Sorry I wasn't around yesterday. I spent all day on a PC during office hours & just needed a break when I got home.

Please can I just confirm where we are up to?

Do we have converted ".pdf" files, but they are not being copied (moved)? Or are the ".doc" files not being converted (yet)?

With respect to moving the ".pdf" files; have you tried renaming the two folders to not have space characters?

"C:\AA Word Files" becomes "C:\AA-Word-Files" (in your file structure & in the code)
"C:\AA PDF Files" becomes "C:\AA-PDF-Files" (ditto)

If this then works you could rename back to how you prefer the folders to be named, but then change the code to include quotation marks around both paths/filenames.

Also, if you need a replacement for the Application.FileSearch method for MS-Office (Access) 2007 you can use the Dir() [or Dir$()] function as suggested by Microsoft. This will work, but it is quite slow in execution in comparison to using direct calls to the operating system library routines (the MS-Windows Software Development Kit [SDK] Application Programming Interface [API] Dynamic-Link Libraries [DLLs]).

If you cannot get the Dir() suggestion working to your satisfaction, please let me know & I'll provide an alternative method for you.

I won’t post the code now as it may confuse or hinder discussions at present, but if speed of execution of the process is important to you, just let me know.

BFN,

fp.
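
For anyone who wants to try the Dir() route mentioned above, a minimal sketch of a recursive search is given below. It is not Microsoft's published workaround, just one way the same idea could be written; the procedure names FindFiles and ListWordDocuments and the folder path are assumptions for the example.

```vba
Option Explicit

' Hypothetical helper: adds the full path of every file matching pattern
' under rootPath (subfolders included) to results.
Sub FindFiles(ByVal rootPath As String, ByVal pattern As String, ByVal results As Collection)
    Dim entry As String
    Dim subFolders As New Collection
    Dim item As Variant

    If Right$(rootPath, 1) <> "\" Then rootPath = rootPath & "\"

    ' Pass 1: files in this folder that match the pattern.
    entry = Dir$(rootPath & pattern, vbNormal)
    Do While Len(entry) > 0
        results.Add rootPath & entry
        entry = Dir$()
    Loop

    ' Pass 2: note the subfolders first. Dir() keeps only one search going
    ' at a time, so recursing inside this loop would reset it.
    entry = Dir$(rootPath & "*", vbDirectory)
    Do While Len(entry) > 0
        If entry <> "." And entry <> ".." Then
            If (GetAttr(rootPath & entry) And vbDirectory) = vbDirectory Then
                subFolders.Add rootPath & entry
            End If
        End If
        entry = Dir$()
    Loop

    ' Pass 3: repeat for each subfolder.
    For Each item In subFolders
        FindFiles CStr(item), pattern, results
    Next item
End Sub

' Example use: print every match to the Immediate window.
Sub ListWordDocuments()
    Dim found As New Collection
    Dim item As Variant

    FindFiles "C:\AA Word Files", "*.doc", found
    For Each item In found
        Debug.Print item
    Next item
End Sub
```

The collection returned can be looped over in the same way as .FoundFiles was. One thing to watch: on Windows a "*.doc" pattern may also return ".docx" files, so an extra check on the last four characters of each name may be needed.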
#25
You 2 are both awesome. The code works a treat. I just worked out how long it would have taken me to do manually, and it was over 50 hours. So you have saved me an incredible amount of time.

If I can do anything in return please let me know.

Thanks again for all your help, you have made me very happy
#26
I did very little, allsa001, but thanks for your gratitude in any respect.

BFN,

fp.
#27
cutePDF

easy
#28
I use PDF995(.com).

BFN,

fp.
