Diff Microsoft Word (docx) documents

Rob Barrett
I'm New Here
Those new to the Atlassian Community have posted less than three times. Give them a warm welcome!
June 17, 2013

I'm using Sourcetree to manage Microsoft Word documents (docx). I understand git/hg/etc. aren't really designed for handling these binary files.

I'd like to have a more usable diff for these documents. I have scripts that do two different kinds of diff:

1. Textual diff that converts the Word docs into plain text and then does a standard diff on them.

2. Visual diff that controls the Word application to produce a composite document from the two that I'm diff'ing.
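For anyone wondering what the textual diff in item 1 might look like, here is a minimal sketch in Python. It is not Rob's script, which is not shown in this thread. The function names `docx_paragraphs` and `docx_text_diff` are made up for illustration. The sketch reads only the main body text (`word/document.xml`), so images, headers, footnotes, and formatting are ignored.

```python
import difflib
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the main document part of a .docx file.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_paragraphs(path):
    """Return the text of each paragraph in the body of a .docx file."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W + "p"):
        # Text runs live in <w:t> elements; join them per paragraph.
        paragraphs.append("".join(t.text or "" for t in para.iter(W + "t")))
    return paragraphs


def docx_text_diff(old_path, new_path):
    """Return a unified diff (list of lines) of two .docx files' paragraph text."""
    return list(
        difflib.unified_diff(
            docx_paragraphs(old_path),
            docx_paragraphs(new_path),
            fromfile=str(old_path),
            tofile=str(new_path),
            lineterm="",
        )
    )
```

Calling `print("\n".join(docx_text_diff("old.docx", "new.docx")))` (hypothetical file names) would print the differences paragraph by paragraph.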

Both of these scripts work fine from the command line. Can someone help me configure hg and Sourcetree so that the textual diff appears in the Sourcetree GUI and the visual diff is launched by clicking on the "external diff" button?

I really appreciate it and would be willing to share the scripts for others who are interested.

3 answers

2 votes
npapnet March 30, 2018

I've had this problem for years (previously I used TortoiseHg, which had it solved).

I recently came across this project on GitHub:

https://github.com/ForNeVeR/ExtDiff

It solves this problem both in Sourcetree and on the command line.

Simple steps:

- Download the project and extract it to a location of your choice.

- Add the following line to the project's .gitattributes:

    *.docx diff=word

- Add the following lines to the global .gitconfig:

    [diff "word"]
    command = <extraction location>/diff-word-wrapper.cmd

 

sunk818
Contributor
April 5, 2018

How has using ExtDiff been for you? Is there a Windows binary for this?

npapnet April 5, 2018

There is no need for a binary. What I described is the procedure I followed on Windows.

The only thing to remember is that you need to create a .gitattributes file in every repository.
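Since the .gitattributes entry has to be repeated in every repository, a small helper can save some typing. The following is only a sketch: the function name `ensure_docx_attribute` is invented here, and it assumes you pass it the top-level folder of the repository.

```python
from pathlib import Path


def ensure_docx_attribute(repo_root, line="*.docx diff=word"):
    """Append `line` to <repo_root>/.gitattributes unless it is already present.

    Returns True if the file was changed, False if the line was already there.
    """
    attributes = Path(repo_root) / ".gitattributes"
    existing = attributes.read_text(encoding="utf-8") if attributes.exists() else ""
    if line in (entry.strip() for entry in existing.splitlines()):
        return False
    # Make sure the new entry starts on its own line.
    if existing and not existing.endswith("\n"):
        existing += "\n"
    attributes.write_text(existing + line + "\n", encoding="utf-8")
    return True
```

Running it twice on the same repository is harmless, because the second call finds the line and leaves the file alone.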

Jerel McDonald
November 1, 2018

Tried this, but Sourcetree still says it's a binary file.

Boris Molostov
September 3, 2019

I did the same, but it doesn't work =(


Jerel McDonald
September 3, 2019

FYI, I did get this to work. It is awesome.

0 votes
sunk818
Contributor
January 18, 2017

I'm using SourceTree 1.9.10.0 and docx seems to be comparing fine for me on the textual level. It seems to be a recent change because I don't recall previous versions comparing on the text level.

Images pasted into a Word docx are not compared, so the accuracy is limited.

[Attached screenshot: 170118-02.jpg]

As an aside, if I knew how to use Markdown with locally referenced images or relative image links, I'd prefer Markdown.

john zajac
January 16, 2018

Just to be clear, you are talking about diffing through Sourcetree, but wdiff alone can give us what we need, right? I'd rather not use Sourcetree: it's slow, and MSI updates don't preserve previous data very well, so settings often get lost.

sunk818
Contributor
September 3, 2019

SourceTree installer has improved a lot lately and even the SSH integration is much better. So, I suggest you give it a spin. It seems they've really listened to the feedback and made improvements. I love software companies that listen to customer feedback and make noticeable updates.

0 votes
Flo Ledermann
I'm New Here
October 8, 2014

I have the same problem here. I've set up diffing of Word docs according to the article at http://blog.martinfenner.org/2014/08/25/using-microsoft-word-with-git/ by putting

*.docx diff=pandoc

in my .gitattributes and adding this section in .git/config:

[diff "pandoc"]
  textconv=pandoc --to=markdown
  prompt = false

This works fine from the command line, but in SourceTree I get a spinning icon and nothing happens when the diff should be displayed.

sclarke81 September 22, 2015

Did you find a solution to this?

Ansar Rezaei
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
January 12, 2016

Hi, I need this too. I'm new to git and Sourcetree, so could someone please explain it in more detail? I can't find .gitattributes and .git/config.

Chris Frisina
January 28, 2016

@Ansar Rezaei: You have to create a file called .gitattributes (the leading dot hides it from normal OS views, but it is still there) and put it at the PROJECT's top level, next to the .git folder and usually alongside a .gitignore file. The .gitconfig lives at a more global level in your OS, most likely in your home folder, and may already exist with some other settings in it. Just append the code to it.
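If it helps, the locations described above can be printed with a few lines of Python. This is just a sketch: the function names are invented for illustration, and it assumes an ordinary clone where .git is a folder at the top of the project.

```python
from pathlib import Path


def find_repo_root(start):
    """Walk upwards from `start` until a directory containing a .git folder is found."""
    current = Path(start).resolve()
    for candidate in (current, *current.parents):
        if (candidate / ".git").is_dir():
            return candidate
    return None  # not inside a repository (or .git is not a plain folder)


def config_locations(start="."):
    """Return where the files discussed in this thread would normally live."""
    root = find_repo_root(start)
    return {
        "gitattributes": root / ".gitattributes" if root else None,
        "repo_config": root / ".git" / "config" if root else None,
        "global_gitconfig": Path.home() / ".gitconfig",
    }


if __name__ == "__main__":
    for name, location in config_locations().items():
        print(name, "->", location)
```

Run it from anywhere inside the project folder. The .gitattributes file may not exist yet, in which case you create it at the path shown.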

 

@Rob Barrett: Can you please share the scripts? Feel free to email me at atlassian@specialorange.org. Thanks!

Hermann Klocker-Mark
April 25, 2016

I followed the link Flo mentioned. It works from the command line for small files but not for bigger ones, and I get the spinning cursor in Sourcetree. Could anyone post a complete solution with all the necessary scripts and files? It would be very useful for many of us (just google git+word).
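One way to narrow this down is to time the conversion step on its own for a small and a large document. Whether conversion time has anything to do with the spinner is only a guess, not something confirmed in this thread. The sketch below uses an invented helper name, `time_textconv`, and runs whatever converter command you give it with the file path appended.

```python
import subprocess
import time


def time_textconv(command, path, timeout=60):
    """Run a textconv-style command on `path` and measure how long it takes.

    Returns (seconds, output_length_in_bytes). Raises
    subprocess.TimeoutExpired if the command exceeds `timeout` seconds and
    subprocess.CalledProcessError if it exits with an error.
    """
    started = time.perf_counter()
    result = subprocess.run(
        [*command, str(path)],
        capture_output=True,
        check=True,
        timeout=timeout,
    )
    return time.perf_counter() - started, len(result.stdout)
```

For the pandoc setup from Flo's post, the call would be `time_textconv(["pandoc", "--to=markdown"], "big.docx")`, where `big.docx` is a placeholder for one of your own files.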

Ansar Rezaei
April 25, 2016

I'd really appreciate it if someone could share a complete solution.

 
