Files auto-categorization & language detection
Smart Project helps the PM by categorizing uploaded files automatically.
Files with the extensions .pdf, .png, .jpg, .jpeg, .gif, .tmp, .zip, .tar and .gz are always categorized as "Source Document."
General rules for automatic categorization are as follows:
- Source Document - default fallback category
- Source to Be Prepared - not applicable
- CAT Package - recognized extensions: .sdlppx, .mqout
- CAT Package (Return) - recognized extensions: .sdlrpx, .mqback
- Translated Document - not applicable
- Reference File - not applicable
- Terminology - recognized extensions: .tbx, .sdtbx, .mqtbx
- Translation Memory - recognized extensions: .tmx, .sqtmx, .mqtmx, .sdtmx, .sdltm
- CAT Analysis - ignored extension: .doc; size must be below 1 MB
- Bilingual Document - recognized extensions: .rtf, .sdlxliff, .mqxlz, .mqxliff, .doc, .docx, .zip
- Formatted Document - not applicable
- Segmentation Rules - content must include the following attribute: resourcetype="SegRules"
- Filtering Rules - content must include the following attribute: resourcetype="FilterConfigs"
- QA Report - recognized extension: .xlsx; size must be below 500 KB
- memoQ Light Resource - content must include the following attribute: resourcetype="*"
- Other - not applicable
Note that .xlsx files larger than 10 MB will be categorized as Source Documents.
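The rules above can be read as a small decision procedure. The following is a minimal Python sketch of that reading, not the actual Smart Project implementation. The function name `categorize`, the order in which rules are checked, and the interpretation of 1 KB as 1024 bytes are all assumptions made for illustration. The CAT Analysis rule is left out because the list does not say which extensions it matches.

```python
import re
from pathlib import PurePath

# Extensions the article says are always Source Documents.
ALWAYS_SOURCE = {".pdf", ".png", ".jpg", ".jpeg", ".gif", ".tmp", ".zip", ".tar", ".gz"}

# Extension-based categories, copied from the list above.
BY_EXTENSION = {
    "CAT Package": {".sdlppx", ".mqout"},
    "CAT Package (Return)": {".sdlrpx", ".mqback"},
    "Terminology": {".tbx", ".sdtbx", ".mqtbx"},
    "Translation Memory": {".tmx", ".sqtmx", ".mqtmx", ".sdtmx", ".sdltm"},
    # .zip is also listed here in the article, but the "always" rule is
    # checked first in this sketch, so it never reaches this entry.
    "Bilingual Document": {".rtf", ".sdlxliff", ".mqxlz", ".mqxliff", ".doc", ".docx", ".zip"},
}

# Content-based categories keyed by the value of the resourcetype attribute.
BY_RESOURCE_TYPE = {
    "SegRules": "Segmentation Rules",
    "FilterConfigs": "Filtering Rules",
}

QA_REPORT_MAX_BYTES = 500 * 1024  # assumption: 1 KB = 1024 bytes
RESOURCE_TYPE = re.compile(r'resourcetype="([^"]*)"')


def categorize(filename: str, size_bytes: int, content: str = "") -> str:
    """Return a suggested category; the rule order here is an assumption."""
    ext = PurePath(filename).suffix.lower()
    if ext in ALWAYS_SOURCE:
        return "Source Document"
    for category, extensions in BY_EXTENSION.items():
        if ext in extensions:
            return category
    if ext == ".xlsx" and size_bytes < QA_REPORT_MAX_BYTES:
        return "QA Report"
    match = RESOURCE_TYPE.search(content)
    if match:
        # Any other resourcetype value counts as a memoQ Light Resource.
        return BY_RESOURCE_TYPE.get(match.group(1), "memoQ Light Resource")
    return "Source Document"  # default fallback category
```

For example, `categorize("report.xlsx", 100 * 1024)` yields "QA Report" in this sketch, while the same file name with a size of 20 MB falls through to the default "Source Document".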
Files in a Smart Project can also have languages assigned to them. When a file is uploaded in either the Home Portal or the Vendor Portal, the system suggests languages using the following methods:
- use of project or quote source/target languages
- use of job languages
- content parsing of XLIFF, TM and TB files
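To illustrate the content-parsing method, here is a minimal Python sketch that reads language codes from XLIFF and TMX content. It is an assumption about how such parsing could work, not a description of the system's own parser. It relies on the standard `source-language` and `target-language` attributes of the XLIFF 1.2 `file` element and on the TMX `srclang` header attribute and `xml:lang` on `tuv` elements. The function name `suggest_languages` is hypothetical, and TB (terminology) files are omitted for brevity.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def local_name(tag: str) -> str:
    """Strip a '{namespace}' prefix from an element tag, if present."""
    return tag.rsplit("}", 1)[-1]


def suggest_languages(xml_text: str) -> Tuple[Optional[str], List[str]]:
    """Return (source language, target languages) found in XLIFF or TMX text."""
    root = ET.fromstring(xml_text)
    source: Optional[str] = None
    targets: List[str] = []
    for element in root.iter():
        name = local_name(element.tag)
        if name == "file":  # XLIFF 1.2 <file> element
            source = source or element.get("source-language")
            target = element.get("target-language")
            if target and target not in targets:
                targets.append(target)
        elif name == "header" and element.get("srclang"):  # TMX header
            source = source or element.get("srclang")
        elif name == "tuv":  # TMX translation unit variant
            lang = element.get(XML_LANG) or element.get("lang")
            if lang and lang not in targets:
                targets.append(lang)
    # A TMX lists the source language among its variants; drop it from targets.
    targets = [t for t in targets if t.lower() != (source or "").lower()]
    return source, targets
```

In this sketch, a file with no recognizable language attributes returns `(None, [])`, which is where the project, quote, or job languages would be used instead.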
Remote files, i.e., files coming from an integrated CAT tool, reflect the language information as seen in the third-party software. This applies to the following file categories:
- Bilingual Document
- Translation Memory
- Terminology
- CAT Analysis