PowerShell to find Text in PDF and Word

As a follow-up to my previous article on using PowerShell to find hyperlink text in PowerPoint files, here I show how to find hyperlinks in Word or PDF files. The script drives Word through COM automation, so Word must be installed on the machine that runs it. The sample code is below; tweak it for your purpose and drop me a comment if you need help.

$FilePath = "C:\Users\xxx\"
# Change the filter to "*.doc*" for Word documents
$OurDocuments = @(Get-ChildItem -Path $FilePath -Filter "*.pdf" -Recurse)

$Word = New-Object -ComObject Word.Application
$Word.Visible = $false
$i = 0

$OurDocuments | ForEach-Object {
    $File = $_          # keep the file object; $_ is rebound in nested blocks
    $Document = $null
    try {
        # Arguments: FileName, ConfirmConversions = $false, ReadOnly = $true
        $Document = $Word.Documents.Open($File.FullName, $false, $true)
        #"Processing file: {0}" -f $Document.FullName
        $Document.Hyperlinks | ForEach-Object {
            # Match on either the link target or the text shown in the document
            if ($_.Address -like "https://domain.com*" -or $_.TextToDisplay -like "https://domain.com*") {
                "Found issues {0} `r`n" -f $Document.FullName
                "Found issues {0} `r`n" -f $_.Address
                "Found issues {0} `r`n" -f $_.TextToDisplay
            }
        }
    }
    catch {
        Write-Host "An error occurred while accessing" $File.FullName
    }
    finally {
        # Close the document without saving so Word does not hold the file open
        if ($Document) { $Document.Close($false) }
    }

    #"Completed processing {0} `r`n" -f $File.FullName
    $i++
    Write-Progress -Activity "Searching Hyperlinks" -Status "Progress:" -PercentComplete ($i / $OurDocuments.Count * 100)
}

# Shut down the hidden Word instance when done
$Word.Quit()
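
If you want to tweak the search pattern, it can help to try the match condition on plain strings first, without starting Word. The sketch below is only an illustration: the function name Test-HyperlinkMatch and its parameter names are my own assumptions and are not part of the script above. It applies the same -like wildcard test to a link address and its display text.

```powershell
# Hypothetical helper (not part of the original script): returns $true when
# either the link target or its display text matches the wildcard pattern.
function Test-HyperlinkMatch {
    param(
        [string]$Address,
        [string]$TextToDisplay,
        [string]$Pattern = "https://domain.com*"
    )
    # -like is case-insensitive and supports * and ? wildcards
    return ($Address -like $Pattern) -or ($TextToDisplay -like $Pattern)
}

# Try it on plain values, no Word session required
Test-HyperlinkMatch -Address "https://domain.com/page" -TextToDisplay "Click here"   # True
Test-HyperlinkMatch -Address "https://example.org" -TextToDisplay "Docs"             # False
```

Once the pattern behaves the way you expect, put the same pattern into the if condition of the main script.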


