Document Structure Analysis extracts the different sections of a document that carries markup, including formatted files such as PDF or Microsoft Word documents. The sections it identifies include the title, headings, abstract, and the parts of an email.
Although this process takes some language markers into account, it is based mainly on the document's markup, so it can be applied to documents in any language.
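To illustrate why a markup-based approach is largely language-independent, here is a minimal sketch in Python. It is not the service's implementation or API: it assumes an HTML input and uses only the standard library's html.parser module, and the names StructureExtractor and extract_structure are hypothetical. It picks out the title and headings purely from the tags, without inspecting the language of the text.

```python
from html.parser import HTMLParser

# Tags treated as headings in this sketch (an assumption; a real analyzer
# would also rely on fonts, styles, and layout in PDF or Word files).
HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class StructureExtractor(HTMLParser):
    """Hypothetical helper: collects title and headings from tags alone."""

    def __init__(self):
        super().__init__()
        self.title = None
        self.headings = []    # list of (level, text) tuples
        self._current = None  # tag whose text is being captured
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" or tag in HEADING_TAGS:
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._current:
            return  # closing tag of a nested element such as <em>
        # Collapse runs of whitespace in the captured text.
        text = " ".join("".join(self._buffer).split())
        if tag == "title":
            self.title = text
        else:
            self.headings.append((int(tag[1]), text))
        self._current = None


def extract_structure(html):
    """Return the title and headings found in an HTML string."""
    parser = StructureExtractor()
    parser.feed(html)
    parser.close()
    return {"title": parser.title, "headings": parser.headings}


if __name__ == "__main__":
    sample = (
        "<html><head><title>Informe anual</title></head><body>"
        "<h1>Resumen</h1><p>Texto.</p><h2>結果 <em>2023</em></h2>"
        "</body></html>"
    )
    print(extract_structure(sample))
```

The sample mixes Spanish and Japanese text on purpose: because the extraction depends only on the tags, the same code handles both without any language-specific rules.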
Do you have any questions? Have you detected a bug? Contact us through our feedback section or at email@example.com