Updated Events GitHub repository – Convert pptx to pdf

As I have been speaking at a number of events recently, I have also been updating my GitHub Events repository. Usually I include a markdown file with a short description, my demos and my slides. I had been uploading my slides as .pptx files and I noticed that the repository had edged over 100 MB. This prompted me to reconsider this approach. I felt I needed to address the following:

  • Use the most compatible format available; presentations should be viewable on any device
  • Fonts should be correctly represented
  • File size should be minimal

In an effort to use the space I have available more efficiently and to use a more compatible format, I decided to convert my presentations to .pdf.

Because I do not like doing stuff manually, I decided to use PowerShell in combination with a bit of bash scripting to get my repository updated. First, let's take a look at what kind of data we are dealing with:

Get-ChildItem C:\git\Events -File -Filter *pptx -Recurse |
Select-Object -Property FullName

In total there are 29 presentations uploaded in .pptx format. If I had to convert these by hand, it would take about 30 minutes. Exploring what is possible with the PowerPoint.Application COM object took about 5 minutes, and putting together the following script took an additional 5:

Get-ChildItem C:\git\Events -File -Filter *pptx -Recurse |
ForEach-Object -Begin {
    # Load the interop types and start PowerPoint once for the whole pipeline
    $null = Add-Type -AssemblyName Microsoft.Office.Interop.PowerPoint
    $SaveOption = [Microsoft.Office.Interop.PowerPoint.PpSaveAsFileType]::ppSaveAsPDF
    $PowerPoint = New-Object -ComObject "PowerPoint.Application"
} -Process {
    # Open each presentation and save it next to the original as .pdf
    $Presentation = $PowerPoint.Presentations.Open($_.FullName)
    $PdfNewName   = $_.FullName -replace '\.pptx$','.pdf'
    $Presentation.SaveAs($PdfNewName,$SaveOption)
    $Presentation.Close()
} -End {
    # Quit PowerPoint, then force-close any remaining POWERPNT process
    # (this also closes PowerPoint windows that were already open)
    $PowerPoint.Quit()
    Stop-Process -Name POWERPNT -Force
}

This script will recursively look for all .pptx files in the Events repository and then run the following code:

  • In the begin block, load the PowerPoint COM object and the required type for storing files as .pdf
  • For each presentation, open the presentation, generate a new name and convert it to .pdf
  • Finally, in the end block, quit the PowerPoint application and then use Stop-Process to close the window. Note that any other PowerPoint windows you had open will also be closed.
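Because Stop-Process closes every PowerPoint window on the machine, a gentler cleanup may be preferable. The following is a minimal sketch of the same conversion with try/finally blocks, assuming PowerPoint is installed. The function name Convert-PptxToPdf and its -Path parameter are illustrative and not part of the original script. The value 32 is the numeric value of ppSaveAsPDF, used here so the interop assembly does not have to be loaded. PowerPoint normally runs as a single instance, so the sketch only calls Quit when no other presentations remain open; whether the process then exits may depend on the Office version.

```powershell
function Convert-PptxToPdf {
    [CmdletBinding()]
    param(
        # Root folder to search (hypothetical parameter name for this sketch)
        [Parameter(Mandatory)]
        [string]$Path
    )

    # 32 is the numeric value of PpSaveAsFileType.ppSaveAsPDF
    $ppSaveAsPDF = 32
    $PowerPoint  = New-Object -ComObject 'PowerPoint.Application'
    try {
        Get-ChildItem -Path $Path -File -Filter *.pptx -Recurse | ForEach-Object {
            $PdfNewName   = $_.FullName -replace '\.pptx$', '.pdf'
            $Presentation = $PowerPoint.Presentations.Open($_.FullName)
            try {
                $Presentation.SaveAs($PdfNewName, $ppSaveAsPDF)
            }
            finally {
                # Close the presentation even if SaveAs throws
                $Presentation.Close()
            }
        }
    }
    finally {
        # Leave PowerPoint running if the user still has presentations open
        if ($PowerPoint.Presentations.Count -eq 0) {
            $PowerPoint.Quit()
        }
        # Release the COM reference instead of killing the POWERPNT process
        $null = [System.Runtime.InteropServices.Marshal]::ReleaseComObject($PowerPoint)
        [GC]::Collect()
        [GC]::WaitForPendingFinalizers()
    }
}
```

It would be called as `Convert-PptxToPdf -Path C:\git\Events`.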

Now I have both the .pdf and the .pptx stored in the folder, so let’s take a look at the difference in file size:

foreach ($Extension in ('pptx','pdf')) {
    Get-ChildItem C:\git\Events -File -Filter "*$Extension" -Recurse |
    Measure-Object -Property Length -Sum | ForEach-Object {
        [pscustomobject]@{
            'SizeinMB'  = [math]::Round($_.Sum/1MB,2)
            'Extension' = $Extension
        }
    }
}

A nice decrease in size and a format that is more suitable for sharing; this is looking good. After verifying that the .pdf files look correct, we can remove the .pptx files with the following code:

Get-ChildItem C:\git\Events -File -Filter *pptx -Recurse |
Remove-Item -Force
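The verification above is a manual step. As an extra safeguard, the following sketch removes a .pptx only when a non-empty .pdf with the same base name sits next to it. The function name Remove-ConvertedPptx and its -Path parameter are hypothetical, introduced here for illustration. A non-empty file is only a rough proxy: it does not prove the PDF renders correctly.

```powershell
function Remove-ConvertedPptx {
    [CmdletBinding(SupportsShouldProcess)]
    param(
        # Root folder to search (hypothetical parameter name for this sketch)
        [Parameter(Mandatory)]
        [string]$Path
    )

    Get-ChildItem -Path $Path -File -Filter *.pptx -Recurse | ForEach-Object {
        # Expected location of the converted file: same folder, same base name
        $PdfPath = [System.IO.Path]::ChangeExtension($_.FullName, '.pdf')
        $PdfItem = Get-Item -LiteralPath $PdfPath -ErrorAction SilentlyContinue

        if ($PdfItem -and $PdfItem.Length -gt 0) {
            # ShouldProcess makes -WhatIf and -Confirm work for the deletion
            if ($PSCmdlet.ShouldProcess($_.FullName, 'Remove converted .pptx')) {
                Remove-Item -LiteralPath $_.FullName -Force
            }
        }
        else {
            Write-Warning "No usable .pdf found for $($_.FullName); keeping the .pptx"
        }
    }
}
```

Running `Remove-ConvertedPptx -Path C:\git\Events -WhatIf` first lists what would be deleted without touching any files.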

The last step is to commit everything to GitHub and make it available to everyone. I found a nice Stack Overflow thread that explained how to mass remove files:

Removing multiple files from a Git repo that have already been deleted from disk

That left me with the following bash commands to commit everything to the repository:

git ls-files --deleted -z | xargs -0 git rm 
git add *
git commit -m "Removed pesky pptx and added glorious pdf"
git push origin master

And to view the result, here is what it looks like on GitHub now, along with the commit:

GitHub – JaapBrasser – Events – Commits

Let me know what you think: is .pdf a more useful format than .pptx for sharing presentations, or would you rather see it the other way around?
