Format race 2018: DOCX over DOC and ODT?

Hello everyone!

Today we are going to introduce you to DOCX, one of our core formats and probably the format ONLYOFFICE editors work with most. Read this post to learn what DOCX is, how it differs from rival formats and why we chose to work with it in our solutions.

Using DOCX in ONLYOFFICE

DOCX is an Office Open XML (OOXML) format for text documents: text, images, objects, formatting and text styles are zipped together in a single archive. Aiming at maximum applicability of the documents produced with our solutions, we took the OOXML formats as the core of our suite. Let us explain what we mean by applicability, starting with an important comparison.

The difference between DOC and DOCX

First of all, it is about the form. Microsoft was forced to support the idea of open formats, introducing DOCX as the default format for Office instead of the fully proprietary DOC.

DOCX is standardized: it follows the ECMA-376, ISO/IEC 29500 Transitional and ISO/IEC 29500 Strict standards. This lets third-party developers create applications that work with it using ordinary XML libraries. This, together with its broader range of supported features and its compactness, made DOCX popular.

Anatomically, DOC is a binary format, and DOCX is XML-based. A DOCX file is an archive that stores the separate parts of the document, and it can be opened as a ZIP archive simply by changing the extension from DOCX to ZIP.
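To illustrate the point about the ZIP structure, here is a minimal Python sketch that reads paragraph text out of a DOCX file using only the standard library. The function names (docx_paragraphs, make_sample_docx) are our own, made up for this example. The sample archive built here contains only word/document.xml, so it is a stripped-down stand-in and not a complete DOCX that an editor would open. The namespace used is the one for Transitional documents; Strict documents use a different namespace, which this sketch does not handle.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by Transitional DOCX files.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(source):
    """Return the plain text of each paragraph in a DOCX path or file-like object."""
    # A DOCX file is a ZIP archive; the main body lives in word/document.xml.
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across runs; join every w:t element.
        text = "".join(node.text or "" for node in para.iter(f"{{{W_NS}}}t"))
        paragraphs.append(text)
    return paragraphs


def make_sample_docx():
    """Build a tiny in-memory archive for the demo (hypothetical helper).

    Only word/document.xml is included, so this is not a complete DOCX.
    """
    xml = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        "<w:p><w:r><w:t>Hello, </w:t></w:r><w:r><w:t>DOCX</w:t></w:r></w:p>"
        "<w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", xml)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for line in docx_paragraphs(make_sample_docx()):
        print(line)
```

With a real document, you would pass a file path instead, for example docx_paragraphs("report.docx"), where the file name is just a placeholder.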

However, accessibility is not the only benefit of DOCX.

Let the numbers speak: how popular are DOCX files?

Using Google's filetype search, we studied how many DOCX, DOC and ODT files are stored around the web, and how these figures change over time.

Why these file formats? DOC is the ancestor of DOCX and is still widely used despite its technical disadvantages and Microsoft's attempts to reduce its usage. At the same time, ODT is a direct ‘competitor’ of DOCX, advocated by those who want to respect open source and fight proprietary software (find more about ODF formats in our recent post).

Let’s take a look at how many files with names containing “1” Google could find in each format over different time periods, using filetype search:

Filetype search

From the figures above you can see that the majority of documents are probably still kept in DOC. The total number of DOC files in the search results is more than twice that of DOCX; however, at the current pace of growth, DOCX is clearly overtaking DOC. At the same time, ODT turns out to be by far the least popular of the three.

Using the same queries, let’s see how many files in each format were saved every year starting from 2013 (counted in thousands of documents):

Format usage trends

Since the number of ODT files is significantly smaller than that of DOC and DOCX, the trend is barely recognizable on the graph. Here’s a separate graph for ODT:

ODT usage trends

You can see consistent growth in the use of ODT files as well. However, the overall number of documents in use increases with time, so it is better to compare this trend with the others. The number of ODT files created in the past year is only 2.3 times the annual figure from five years ago, while for DOCX the growth is 10.6 times.

The reasons why we prefer DOCX

It is far more popular than ODT. When creating ONLYOFFICE, we decided that applicability is the most important aspect of such an everyday tool as the electronic document.

It is more capable than DOC. Additionally, DOCX is clearly displacing DOC in document workflows, both naturally and by design.

It is an open format. This basically means that software developers can create applications that work with DOCX thanks to its standardized nature. It is a win both for the expected growth of its popularity and for the global trend towards software liberalization.

It eases the transition from MS Office to open-source software. DOCX, like the other OOXML formats, is based on open standards that developers can adopt, unlike the fully proprietary DOC that can be used only in MS Office.

It has been the default format for MS Office since version 2007. The tendency to use DOC comes mainly from users who haven’t upgraded to newer versions of Office yet, and their number is decreasing.

If you have any questions or would like to add something, feel free to comment below.

 
Comments (3)
  1. Italo Vignoli
    September 12, 2018 at 11:17 am

    OOXML is a de facto proprietary document format, which was approved as standard because it was supposed to become standard based on the OOXML Strict definition (which was never deployed in any version of MS Office).
    Today, the non standard version and the Strict version are identical, and completely non standard (as they change without any versioning warning, include binary blobs and non standard bits such as representation of dates, names of colours, names of linguistic versions, etcetera).
    Of course, you are free to choose the document format for your software, but you should avoid promoting it as standard.

  2. Joe Ferrucci
    October 22, 2018 at 9:39 am

    One major difference that I’ve noticed with the Microsoft vs odt is that when you do a file contents search in Windows, I’m using 10, Windows doesn’t look inside odt files. It will look inside doc, docx and pdf files but not odt.

  3. Alejandro
    October 23, 2018 at 2:34 am

    Well, “more used” differs from being “more popular”, considering a vendor lock-in like Microsoft… otherwise, in practice OOXML is not an open format and it is not easy to transition from MS Office to open source software. It seems a very fake article.
