Using regex to bulk find and replace text with superscript
Thread poster: Rodrigo Rosales Sosa
Mexico
English to Spanish
Jul 3, 2022

Hello there, colleagues:

I have a significant number of segments where I need to replace pairs of digits with those same digits in superscript. In this particular case they are numbers in scientific notation, like 7.00E+02 or 7.00E-02 (meaning 7 × 10² or 7 × 10⁻², with the 2 or −2 in superscript). The job is English into Spanish.

Since it's impractical to do this case by case, I'm using the following regex for the bulk find and replace: (\d)(\.)(\d{1,2})E(\+|-)(\d{2}), where (\d{2}) needs to be set in a superscript font. (I won't be exporting the document, just sending the bilingual file, so I won't be doing it in Word.) I can, of course, settle for the power sign (^) and be done with it, but I'd like to know whether this is possible through regex.

Kind regards, and I appreciate your help.
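
For readers who want to test the pattern from this post outside a CAT tool, here is a minimal Python sketch. It only shows what the regex captures: group 4 is the sign and group 5 is the two-digit exponent, which are the parts that need to end up in superscript. The function name `find_exponents` and the sample sentence are hypothetical, and Python's `re` module is an assumption here, not the engine memoQ uses. A regex replacement can only output characters, not font formatting, so the superscript has to come from tags or from dedicated characters.

```python
import re

# The pattern from the post: one digit, a dot, 1-2 decimals, E, a sign, 2 exponent digits.
PATTERN = re.compile(r"(\d)(\.)(\d{1,2})E(\+|-)(\d{2})")


def find_exponents(text):
    """Return (whole match, sign, exponent digits) for every hit in text."""
    return [(m.group(0), m.group(4), m.group(5)) for m in PATTERN.finditer(text)]


# Hypothetical sample segment.
sample = "Values ranged from 7.00E+02 to 7.00E-02."
for whole, sign, digits in find_exponents(sample):
    print(whole, sign, digits)
```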


 
James Plastow
United Kingdom
Member (2020)
Japanese to English
notepad++ Jul 3, 2022

I once had a job like this and did it by batch-replacing the tags directly in the sdlxliff file in Notepad++. It is easy enough if you are careful, but be sure to make a backup first, as any mistake will corrupt the file.

 
Rodrigo Rosales Sosa
Mexico
English to Spanish
TOPIC STARTER
No tags in this case, I'm afraid Jul 3, 2022

Hello, James:

I would try that option, but there are no tags in this case. I'm wondering if I could create superscript tags. Is that possible?


 
Dan Lucas
United Kingdom
Member (2014)
Japanese to English
Unicode points? Jul 3, 2022

Rodrigo Rosales Sosa wrote:
Since it's impractical doing it case by case, I'm using the following regex for the bulk find and replace: (\d)(\.)(\d{1,2})E(\+|-)(\d{2}), where (\d{2})

Superscript minus, superscript two, and many similar symbols exist as Unicode code points, so you might be able to get this kind of thing: ⁻². I would use this site to look up the code points (just type in "superscript"); then it's just a case of working out how memoQ handles Unicode in its regex engine. If it uses the .NET flavour of regexes, it would probably be something like \u207B for superscript minus.

Dan
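
As a rough way to look up the code points mentioned above without a website, the following Python sketch prints the `\uXXXX` escape and the official Unicode name for each superscript digit and sign. The helper name `code_point_table` is hypothetical. Whether a given tool accepts the `\uXXXX` form in its find or replace field is an assumption to verify in that tool.

```python
import unicodedata

# Superscript digits and signs. Note that 1, 2 and 3 sit in the Latin-1
# Supplement block (U+00B9, U+00B2, U+00B3), not next to the other digits.
SUPERSCRIPTS = "⁰¹²³⁴⁵⁶⁷⁸⁹⁺⁻"


def code_point_table(chars):
    """Return (character, \\uXXXX escape, Unicode name) for each character."""
    return [(ch, f"\\u{ord(ch):04X}", unicodedata.name(ch)) for ch in chars]


for ch, escape, name in code_point_table(SUPERSCRIPTS):
    print(ch, escape, name)
```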


 
James Plastow
United Kingdom
Member (2020)
Japanese to English
tags Jul 3, 2022

Rodrigo Rosales Sosa wrote:

Hello, James:

I would try that option, but there are no tags in this case. I'm wondering if I could create superscript tags. Is that possible?



Hi Rodrigo,

Are you working in Trados? If you are, try opening the xliff in Notepad++ and see what is there. (It helps to install an XML plugin so you can see the text more clearly.) There should be tags wherever there is a superscript. You can batch find and replace to apply these tags to the other elements you want to make superscript.

Dan's solution sounds quicker though.

[Edited at 2022-07-03 19:51 GMT]


 
Stepan Konev
Russian Federation
English to Russian
A series of replacements Jul 3, 2022

You have to run a number of replacements.
Begin with E-:
1. Replace E-(\d+) with ×10@@-$1
2. Replace @@-01 with ⁻¹
3. Replace @@-02 with ⁻²
4. Replace @@-03 with ⁻³
5. Replace @@-04 with ⁻⁴
etc.

Then
1. Replace E\+(\d+) with ×10@@$1
2. Replace @@01 with blank field
3. Replace @@02 with ²
4. Replace @@03 with ³
5. Replace @@04 with ⁴
etc.

[Edited at 2022-07-03 21:20 GMT]
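
The series of replacements above can be tried out in Python before running it in a CAT tool. This is a minimal sketch under some assumptions: the function name `staged_replace` and the `max_exponent` parameter are hypothetical, only two-digit exponents from 01 to 09 are covered (the "etc." in the lists), and Python's `re` writes the backreference as `\1` where the steps above use `$1`.

```python
import re

# Index n holds the superscript form of digit n.
SUPERSCRIPT_DIGITS = "⁰¹²³⁴⁵⁶⁷⁸⁹"


def staged_replace(text, max_exponent=9):
    """Emulate the staged find-and-replace with a temporary @@ placeholder."""
    # Step 1 of each list: mark every exponent with the @@ placeholder.
    text = re.sub(r"E-(\d+)", r"×10@@-\1", text)
    text = re.sub(r"E\+(\d+)", r"×10@@\1", text)
    # Remaining steps: one literal replacement per exponent value.
    for n in range(1, max_exponent + 1):
        sup = SUPERSCRIPT_DIGITS[n]
        text = text.replace(f"@@-{n:02d}", "⁻" + sup)
        # As in the list above, a positive exponent of 01 is replaced with nothing.
        text = text.replace(f"@@{n:02d}", "" if n == 1 else sup)
    return text


print(staged_replace("7.00E+02 and 7.00E-02"))
```

Any `@@` left in the output marks an exponent the series did not cover (for example 00 or values above `max_exponent`), which makes missed cases easy to search for.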


 
Rodrigo Rosales Sosa
Mexico
English to Spanish
TOPIC STARTER
I'll try it out and report back. Jul 4, 2022

Stepan Konev wrote:

You have to run a number of replacements.
Begin with E-:
1. Replace E-(\d+) with ×10@@-$1
2. Replace @@-01 with ⁻¹
3. Replace @@-02 with ⁻²
4. Replace @@-03 with ⁻³
5. Replace @@-04 with ⁻⁴
etc.

Then
1. Replace E\+(\d+) with ×10@@$1
2. Replace @@01 with blank field
3. Replace @@02 with ²
4. Replace @@03 with ³
5. Replace @@04 with ⁴
etc.

[Edited at 2022-07-03 21:20 GMT]


Why didn't I think of copying the numbers already in superscript? I'll report back later. Thank you.


 
Rodrigo Rosales Sosa
Mexico
English to Spanish
TOPIC STARTER
I'll check it out Jul 4, 2022

Dan Lucas wrote:

Rodrigo Rosales Sosa wrote:
Since it's impractical doing it case by case, I'm using the following regex for the bulk find and replace: (\d)(\.)(\d{1,2})E(\+|-)(\d{2}), where (\d{2})

Superscript minus, superscript two, and many similar symbols exist as Unicode code points, so you might be able to get this kind of thing: ⁻². I would use this site to look up the code points (just type in "superscript"); then it's just a case of working out how memoQ handles Unicode in its regex engine. If it uses the .NET flavour of regexes, it would probably be something like \u207B for superscript minus.

Dan


Thank you, Dan. I'll check this option out and report back later.


 
Rodrigo Rosales Sosa
Mexico
English to Spanish
TOPIC STARTER
Solved it Jul 5, 2022

Hello there:

I managed to solve the issue by finding the Unicode characters for each superscript number and the minus operator sign (−), then running a series of replacements starting from 1, and voilà. (Link to screenshot for future reference: https://imgur.com/a/hVVlbPU.)

Thank you for your suggestions and help.
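
For anyone facing the same job in a plain text file rather than inside memoQ, the series of replacements can be collapsed into a single pass with a replacement callback. To be clear, this is a different technique from the one used in this thread, sketched in Python under stated assumptions: the names `to_superscript_notation`, `TO_SUPERSCRIPT` and `E_NOTATION` are hypothetical, leading zeros in the exponent are dropped, a plus sign is omitted, and the output uses spaces around the multiplication sign.

```python
import re

# Map ASCII digits and the hyphen-minus to their Unicode superscript forms.
TO_SUPERSCRIPT = str.maketrans("0123456789-", "⁰¹²³⁴⁵⁶⁷⁸⁹⁻")

# Mantissa (with optional decimals), E, sign, exponent digits.
E_NOTATION = re.compile(r"(\d+(?:\.\d+)?)E([+-])(\d+)")


def to_superscript_notation(text):
    """Rewrite E-notation such as 7.00E-02 as 7.00 × 10⁻² in one pass."""

    def convert(match):
        mantissa, sign, digits = match.groups()
        exponent = str(int(digits))  # drops leading zeros: "02" -> "2"
        if sign == "-":
            exponent = "-" + exponent
        return f"{mantissa} × 10{exponent.translate(TO_SUPERSCRIPT)}"

    return E_NOTATION.sub(convert, text)


print(to_superscript_notation("7.00E+02 and 7.00E-02"))
# prints: 7.00 × 10² and 7.00 × 10⁻²
```

Unlike the staged approach, this keeps an exponent of 1 as ¹ instead of removing it, so adjust `convert` if the target style calls for a bare ×10.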


 
Rodrigo Rosales Sosa
Mexico
English to Spanish
TOPIC STARTER
Sorry, I should've mentioned Jul 5, 2022

James Plastow wrote:

Rodrigo Rosales Sosa wrote:

Hello, James:

I would try that option, but there are no tags in this case. I'm wondering if I could create superscript tags. Is that possible?



Hi Rodrigo,

Are you working in Trados? If you are, try opening the xliff in Notepad++ and see what is there. (It helps to install an XML plugin so you can see the text more clearly.) There should be tags wherever there is a superscript. You can batch find and replace to apply these tags to the other elements you want to make superscript.

Dan's solution sounds quicker though.

[Edited at 2022-07-03 19:51 GMT]


I should've mentioned it earlier: I'm working in memoQ. I did try Dan's solution and it worked. Thank you.


 

