Retrieve docProps from a .docx file
In SharePoint Online you can assign metadata to documents. However, Office Open XML files can also have document properties. In this article I am showing you how you can retrieve those via Power Automate.
Inspiration
This question from DanielStrid:
I’m trying to get the “File Version” property set for a file in Windows Explorer in a Flow. We have a version handling for documents and would like to have it as Metadata and process files differently in a Power Automate flow depending on version. I can not find the other data, for example “Authors” either.
Power Users Community thread: Get File Property ‘Version Number’ for file in Sharepoint (NOT Sharepoint version).
Zip Files
It is probably useful to know that Office Open XML files like .docx are .zip files with XML and other data files, along with a specification of the relationships between them. One of those folders is docProps which contains the core.xml with the properties. Our goal is to retrieve one or all of those properties from the core.xml.
Unzip Open Office XML document via File explorer pic.twitter.com/t1zqalvYBc
— Dennis (@expiscornovus) March 20, 2022
Properties method approach
I managed to find two approaches to retrieve the document properties. One is a bit (over)complicated and the other one is a bit simpler.
The first approach is the Unzip approach. In this approach we are going to rename the .docx as a zip, extract and retrieve the core.xml from the docProps folder. With some substring functions you are able to find your properties.
The second approach is the Properties method approach. We can use the Properties method from the REST API in a Send an HTTP request action. It is more straightforward and you can even achieve this with one action. So, it would be my suggested approach.
Unzip approach
Like mentioned earlier in this blog, the Properties method approach is probably simpler and easier. However, that does not mean we can have some fun and trying the unzip approach with Power Automate :P.
In addition to that this setup might be useful for other scenarios as well. That is why I still wanted to share it.
1. Add a Manually trigger a flow trigger action.
2. Add a Get file content using path action (from the SharePoint Connector).
This step is to retrieve the content of the file located in a SharePoint Online location.
3. Add a Create file action. Select a location and a filename you want to use. Use the body of the previous action.
This step is to create a new temporary file in OneDrive for Business location.
4. Add a Move or rename a file using path action. Use the Path field of the previous action.
This step is to rename the temporary file to a .zip file.
5. Add a Extract archive to folder action. Use the Path field of the previous action.
This step is to unzip the .zip file.
6. Add a Get file content using path action (from the OneDrive for Business Connector).
This step is retrieve the core.xml content.
7. Add a Compose action. Add the two expressions of the code snippet below in the expression editor.
This step uses split and substring functions to retrieve the correct property values. It also might be possible with xpath, but I have not tried that.
That should be it for the setup.
Happy testing!