Ticket: DeepL Auto-Translate introduces spurious returns translating RTE field with bullets

Status Resolved
Add-on / Version Publisher 3.7.2
Severity
EE Version 6.3.5

Gavin @ JCOGS

Sep 23, 2022

This was my first time using the auto-translate feature.
I added the API key, enabled the service, opened an entry, and triggered the auto-translation.
Most of the entry was translated without issue, but the RTE field (Redactor), although translated, picked up spurious returns.

Source text:

<p>Nuestros expertos en Magento, Prestashop, Wordpress y WooCommerce 
están a tu disposición, para que no tengas que preocuparte de problemas 
técnicos y puedas centrarte en tareas más productivas para tu negocio.</p>
<ul>
                        <li> <strong>Te complementamos: </strong>Nuestros
 técnicos son programadores especializados que pueden encargarse del 
soporte a tus clientes y otras tareas técnicas que te interese delegar. 
No pierdas tu valioso tiempo con problemas técnicos, caídas, ataques de 
hackers, etc.</li>
                        <li> <strong>Tú mandas: </strong>Tú
 decides si nuestros técnicos hablan directamente con tus clientes o 
gestionamos todo a través de tu agencia. Nos adaptamos a tus 
instrucciones.</li>
                        <li> <strong>Contacta por la vía que prefieras: </strong>Estamos
 disponibles por email, chat y teléfono, también con servicio de 
emergencias. Todo el apoyo que necesitas para que un problema técnico no
 te amargue el día ni te haga perder clientes.</li>
                    </ul>

The translated version ended up like this:

<p>Our Magento, Prestashop, Wordpress and WooCommerce experts are at your disposal, so you don't have to worry about problems.
are at your disposal, so that you don't have to worry about technical
technical issues and you can focus on more productive tasks for your business.
</p>
<ul>
                        <li> <strong>We complement you:</strong>Our
 technicians are specialized programmers who can take care of customer support and
support to your customers and other technical tasks that you want to delegate.
Do not waste your valuable time with technical problems, crashes, hacker attacks, etc..
hacker attacks, etc.</li> <li>
                        <br></li><li> <strong>You're in charge:</strong>You
 decide if our technicians talk directly with your customers or if we manage everything
we manage everything through your agency. We adapt to your
instructions.</li> <li>
                        <br></li><li> <strong>Contact us by the way you prefer: </strong>We are available by email, chat and phone.
 available by email, chat and phone, also with emergency service.
emergencies. All the support you need so that a technical problem does not make your day or
 make your day bitter or make you lose customers.</li>.
                    </ul>

You will see that the translated version has added empty <li> <br></li> fields.

I have worked with DeepL on similar issues with my own Auto-Translate add-on and, if I recall correctly, this is an issue within DeepL’s API that can be controlled via the settings used to initiate the translation. However, without diving into your code, I am not sure whether similar fixes can work with your setup.

Anyhow - as it stands this is an issue / bug - grateful for any guidance you can give on how it might be resolved.

I am currently working on a Laravel Valet local host, so I cannot provide remote access to the server. If access is necessary to resolve this, let me know and I’ll move it onto a staging server you can access. However, if it is the bug I think it is, you should be able to replicate it, or it’s a DeepL issue…

Thanks!

PS - separate point - on Firefox (105) on macOS 12.6 this page renders with black background colour making the location of all content fields impossible to see - need to click around to find the fields. Not sure if this is a bug or style choice, but not helpful.

#1

BoldMinded (Brian)

Comment has been marked private.

#2

Gavin @ JCOGS

Hello. Thanks. Now it doesn’t translate the RTE field content at all: it remains in the source language, while the other fields in the entry are translated. Also, just in passing: if you delete the translation and re-open the page, Publisher shows you the (presumably) cached content from the previous translation, but does not offer a button to re-translate (or to erase the cached content). There is simply a yellow box with:

No saved English translation found. Displaying English translation from the DeepL translation service.

As a result, it is unclear what the workflow is for re-doing the translation.

#3

BoldMinded (Brian)

On your add-on page it says “The exclusion of HTML tags is part of the service provided by the machine translation service”. Are you stripping all HTML? I tried tag_handling = 'html' as an option and it seems to work a lot better, but it’s not perfect. For example, the nested <strong> tags seemed to move a little bit and wrap the wrong words/characters.

#4

Gavin @ JCOGS

In the early days I had many problems with DeepL’s (then) XML parsing service not handling HTML cleanly. I think, partly as a result of my badgering them, they have improved: the HTML option seems to have been one result of this, though it is not yet perfected. Because my add-on has a mode where it is sent the whole HTML output (full-page translate), it strips out some HTML and all the EE elements before submitting to DeepL. The HTML removed is mostly tags known to contain things that won’t translate well (e.g. <head> and <link> tags). It then submits the translation job using the web API, and within that the relevant options are: 'split_sentences' => 'nonewlines'; 'outline_detection' => 0; 'splitting_tags' => 'div'. The add-on has one mode where it submits just text and another where it submits HTML, and it sets the tag_handling option accordingly. Wherever possible it submits just text.
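
For anyone following along, the options listed above could be assembled into a web API request body along these lines. This is only a minimal sketch, not code from either add-on: the helper name buildDeeplParams is hypothetical, the parameters are assumed to be sent form-encoded to DeepL's translate endpoint, and tag_handling is assumed to be 'html' when HTML is submitted.

```php
<?php
// Hypothetical helper (not from Publisher or the JCOGS add-on): builds the
// form parameters for a DeepL translate request using the options quoted above.
function buildDeeplParams(string $text, string $targetLang, bool $isHtml): array
{
    $params = [
        'text' => $text,
        'target_lang' => $targetLang,
        // Split sentences on punctuation only, so newlines in the source
        // are not treated as sentence boundaries.
        'split_sentences' => 'nonewlines',
        'outline_detection' => 0,
        'splitting_tags' => 'div',
    ];

    // Assumption: tag handling is only requested when HTML is submitted.
    if ($isHtml) {
        $params['tag_handling'] = 'html';
    }

    return $params;
}

// Usage: encode the parameters as a form body for the HTTP request.
$params = buildDeeplParams('<p>Hola mundo</p>', 'EN', true);
$body = http_build_query($params);
```

The plain-text mode simply omits tag_handling, which matches the "submit just text wherever possible" approach described above.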

I don’t know if you are using the web API or the PHP library. It might be that the PHP library (which emerged some time after I wrote my add-on) has more or fewer options. I have not used it, so unfortunately I cannot comment.

HTH.

#5

BoldMinded (Brian)

Ok, try the build in the next comment. In the Service/Translators/Deepl.php class you’ll see the $translateOptions array, which is:

$translateOptions = [
    'preserve_formatting' => true,
    'tag_handling' => 'html',
];

Like you mentioned, it isn’t perfect, but in my test it did keep the <li> elements intact and didn’t insert any funky spacing or tags. It just moved the <strong> tags a bit to the left or right depending on the word. This might be as good as it can get.
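
One rough way to confirm that the list structure survived a translation is to compare tag counts before and after. This is a minimal sketch under stated assumptions: countOpeningTags and tagCountsMatch are hypothetical helpers, not part of Publisher or the DeepL client, and counting tags with a regex is only a sanity check, not a real HTML parser.

```php
<?php
// Hypothetical helper: counts opening (or self-closing) tags of one type.
// Closing tags such as </li> are not counted, and <link> does not match "li".
function countOpeningTags(string $html, string $tag): int
{
    return (int) preg_match_all('/<' . preg_quote($tag, '/') . '\b[^>]*>/i', $html);
}

// Hypothetical helper: true when source and translation contain the same
// number of each listed tag, e.g. no extra <li><br></li> items were added.
function tagCountsMatch(string $source, string $translated, array $tags = ['p', 'li', 'strong', 'br']): bool
{
    foreach ($tags as $tag) {
        if (countOpeningTags($source, $tag) !== countOpeningTags($translated, $tag)) {
            return false;
        }
    }
    return true;
}

// Usage: the second translation mimics the extra empty list item from this ticket.
$source = '<ul><li><strong>Uno:</strong> a</li><li><strong>Dos:</strong> b</li></ul>';
$good   = '<ul><li><strong>One:</strong> a</li><li><strong>Two:</strong> b</li></ul>';
$bad    = '<ul><li><strong>One:</strong> a</li><li><br></li><li><strong>Two:</strong> b</li></ul>';

$goodMatches = tagCountsMatch($source, $good);
$badMatches  = tagCountsMatch($source, $bad);
```

A check like this would not notice <strong> tags that wrap the wrong words, since the counts stay the same; it only catches added or dropped elements.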

I am using their PHP API client.

#6

BoldMinded (Brian)

Comment has been marked private.

#7

BoldMinded (Brian)

I also tested that deleting a translation of an entry clears any translation cache.

#8

Gavin @ JCOGS

Ace, thanks - the updated version seems to work better with HTML than before.

#9

BoldMinded (Brian)

Good to hear. I’ll go ahead and close this ticket out. Let me know if anything else pops up.
