carlsmith/replace.py

carlsmith · 2017-04-05T02:12:12Z

If you ran the following code, output would be "spam FOO BAR FOO BAR spam".

string = "spam foo bar foo bar spam"
substitutions = {"foo": "FOO", "bar": "BAR"}
output = replace(string, substitutions)
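
For anyone who wants to run the example above on its own, here is a self-contained sketch. It repeats the `replace` function from the gist and adds one extra case with overlapping keys, to show why the substrings are sorted longest first. The `overlapping` dictionary is an illustrative assumption, not part of the original example.

```python
import re


def replace(string, substitutions):
    # Longest keys first, so "foobar" is tried before its prefix "foo".
    substrings = sorted(substitutions, key=len, reverse=True)
    regex = re.compile('|'.join(map(re.escape, substrings)))
    return regex.sub(lambda match: substitutions[match.group(0)], string)


string = "spam foo bar foo bar spam"
substitutions = {"foo": "FOO", "bar": "BAR"}
output = replace(string, substitutions)
print(output)  # spam FOO BAR FOO BAR spam

# Hypothetical overlapping keys: the longer key wins where both could match.
overlapping = {"foo": "X", "foobar": "Y"}
print(replace("foo foobar", overlapping))  # X Y
```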

ghost · 2018-09-12T14:45:53Z

Thanks! Nice Solution!!!!

thirumalailucifer · 2018-12-26T06:30:36Z

One of the best methods, hats off to you!

feldi-to-go · 2019-05-17T12:49:01Z

Hi all!
I am new to programming in Python, and found your code for easily replacing text.
Now I am facing a problem using your code where the substitutions are not pure text like "foo": "Foo", but regexes that find special patterns.
For example: I have a string with double tabs, double blanks, and certain numbers or text, and each of these needs a different replacement: the double tab with a single tab, the double blank with a single blank, all "006" with "007", and so on.
Is it possible to modify your code to take a dictionary (or list) of regex patterns together with their related replacements, and if so, how?
I would appreciate some help on this topic.
Thank you in advance!
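
One possible way to approach the question above, offered only as a minimal sketch: keep an ordered list of (pattern, replacement) pairs and apply them one after another with `re.sub`. This is a different technique from the gist's single-pass `replace`, because each rule sees the output of the rules before it. The names `RULES` and `replace_patterns` and the sample string are assumptions made for illustration.

```python
import re

# Hypothetical rules; each pair is (regex pattern, replacement text).
RULES = [
    (r"\t{2}", "\t"),   # a double tab becomes a single tab
    (r" {2}", " "),     # a double blank becomes a single blank
    (r"006", "007"),    # every "006" becomes "007"
]


def replace_patterns(string, rules):
    # Rules are applied sequentially, in list order.
    for pattern, replacement in rules:
        string = re.sub(pattern, replacement, string)
    return string


print(replace_patterns("agent  006\t\treporting", RULES))
```

Because the passes run in sequence, the order of the list matters whenever one rule's replacement could be matched by a later rule.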

ridhitatineni1 · 2019-11-26T02:32:24Z

This is such a cool solution! Thank you @carlsmith !

lvrfrc87 · 2020-02-20T09:47:39Z

+1

FritzPeleke · 2020-08-05T12:34:50Z

Hey @carl, I have a question. I have a string, say 'AGCGTGCGCGGACGCGTCGCGGCGCGCGCGCAC'. I want to substitute the regexes 'AC' and 'GT'. However, I need GT and AC to be close to one another. My string has several AT and GC, but I only want to replace the two occurrences of these regexes that are closer together.

dgr113 · 2020-09-17T10:43:02Z

Thank you very much!

This is a slightly edited solution that is a bit more suitable for production:

import re
from dataclasses import dataclass, field
from typing import Any, Mapping, Match, Pattern


@dataclass
class Parser:
    substitutions: Mapping[str, Any]
    patt: 'Pattern' = field(init=False)

    def __post_init__(self):
        self.patt = self.__patt_build()

    def __patt_build(self) -> Pattern:
        substrings = sorted(self.substitutions, key=len, reverse=True)
        regex = re.compile('|'.join(map(re.escape, substrings)))
        return regex

    def _repl_match(self, match: Match) -> str:
        match_group = match.group(0)
        result = self.substitutions[match_group]
        return str(result)

    def replace(self, string: str) -> str:
        return self.patt.sub(self._repl_match, string)



def main():
    context = {'@A': 111, '@B': {'B1': 333}}
    s = "{ 'A': @A, 'B': @B }"

    parser = Parser(context)
    result = parser.replace(s)
    print(result)


if __name__ == '__main__':
    main()

It also supports substitution values of any type that can be cast to a string :)

carlsmith · 2020-09-19T00:49:10Z

@dgr113 - Being totally honest, I haven't studied your code properly, and it uses some stuff I'm not familiar with, so can't offer much in the way of feedback, but it's always nice to have different solutions from different perspectives, in the same language.

Personally, I prefer minimalist solutions that tend to rely on the same, few dozen standard libraries, but that's mainly to do with being lazy, and wanting to support likeminded people.

If you could explain a bit about the advantages of your solution, I'd be interested, and expect other readers would find it useful too. As it stands, it's hard to see the advantage of rewriting a three-line function as a rather complex looking class (where no two method names use the same combination of (presumably meaningful) underscores). The code is well written, so you seem to know what you're doing. I just don't get what it improves really.

dgr113 · 2020-09-19T13:03:56Z

@carlsmith
Firstly, your code raises an error if there are non-string values in "substitutions" (numbers, for example).
Secondly, in the edited version the regex pattern (re.compile) is created once, when the object is created (and can be cached). In your version it is created every time the function is called.
This can be quite an expensive operation. As far as I know, this was the flaw in a comparison of Rust and Go based on a regular expression example: the Rust code constantly recompiled the regex pattern object, and Go managed to slightly overtake Rust.
In the corrected test (with the object created once), the situation was reversed.

carlsmith · 2020-09-21T22:28:23Z

> Your code causes an error if there are non-string values in "substitutions". Numbers for example.

Non-string values should be an error. I could just call str to convert the args to strings, but it makes more sense to leave that to the user, if that's their intention. I don't want to coerce the wrong types, as that's not part of the function's purpose.
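
To make the point above concrete, here is a small sketch using the gist's `replace` function and the `context` dictionary from the earlier comment. It shows the function rejecting non-string values, then the caller converting them explicitly before the call. The names `template` and `as_text` are assumptions introduced for illustration.

```python
import re


def replace(string, substitutions):
    substrings = sorted(substitutions, key=len, reverse=True)
    regex = re.compile('|'.join(map(re.escape, substrings)))
    return regex.sub(lambda match: substitutions[match.group(0)], string)


context = {"@A": 111, "@B": {"B1": 333}}  # non-string values
template = "{ 'A': @A, 'B': @B }"

try:
    replace(template, context)
except TypeError as error:
    # re requires the replacement callable to return a string.
    print("rejected:", error)

# The caller opts in to the conversion explicitly.
as_text = {key: str(value) for key, value in context.items()}
print(replace(template, as_text))  # { 'A': 111, 'B': {'B1': 333} }
```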

> In edited version, regex pattern (re.compile) is created once after object is created (and can be cached).

It would be trivial to wrap my function in a function that takes a regex, and returns my function (now only accepting the substitutions, and enclosing the regex):

def f(regex):
    def myfunc(subs): etc(regex, subs)
    return myfunc

It's two extra lines. I'm just not seeing the need for even a simple class, or the purpose of the wacky method names??
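
A runnable take on the closure idea might look like the sketch below. It departs from the outline above in one respect: the outer function takes the substitutions rather than a ready-made regex, since the pattern is derived from them. The names `make_replacer`, `replacer`, and `shout` are hypothetical.

```python
import re


def make_replacer(substitutions):
    # Build and compile the pattern once, when the replacer is created.
    substrings = sorted(substitutions, key=len, reverse=True)
    regex = re.compile('|'.join(map(re.escape, substrings)))

    def replacer(string):
        # Each call reuses the enclosed regex and substitutions.
        return regex.sub(lambda match: substitutions[match.group(0)], string)

    return replacer


shout = make_replacer({"foo": "FOO", "bar": "BAR"})
print(shout("spam foo bar foo bar spam"))  # spam FOO BAR FOO BAR spam
print(shout("bar foo"))                    # BAR FOO
```

The pattern is compiled a single time per replacer, which addresses the recompilation concern without introducing a class.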

dgr113 · 2020-09-22T10:27:51Z

You have your point of view, and I have mine...
The principles of code readability are important to me.
I try to use lambda functions as little as possible, because they degrade performance and do not support typing.
In addition, I prefer to make the functionality flexible from the start, so that new features can be added in the future. It seems to me that it is more convenient to add new methods or protocols than to rewrite everything.
I don't really like to use OOP unnecessarily myself, but I think the principle of encapsulation (in methods, for example) is the most important.

PS: I only suggested my version, which includes your solution, in the hope that it will help someone, in the spirit of open source.
I'm not attacking your programming principles =)

import re
def replace(string, substitutions):
    substrings = sorted(substitutions, key=len, reverse=True)
    regex = re.compile('|'.join(map(re.escape, substrings)))
    return regex.sub(lambda match: substitutions[match.group(0)], string)
