8

I need to replace, in a large number of Python files with many function definitions, all occurrences of

def some_func(foo, bar):

with

@jit(parallel=True)
def some_func(foo, bar):

with whatever level of indentation def some_func(foo, bar) has.

Example: I want to replace

def some_func_1(foo, bar):

    def some_func_2(foo, bar):

        def some_func_3(foo, bar):

def some_func_4(foo, bar):

with

@jit(parallel=True)
def some_func_1(foo, bar):

    @jit(parallel=True)
    def some_func_2(foo, bar):

        @jit(parallel=True)
        def some_func_3(foo, bar):

@jit(parallel=True)
def some_func_4(foo, bar):

Motivation: I want to "brute-force accelerate/parallelize" an FDTD simulation package without rewriting the entire codebase, by making use of Numba's automatic parallelization with @jit.

PS.: Any comment/critique of this naive approach of (ab)using @jit is also welcome (e.g. if this wouldn't work at all)!

srhslvmn

4 Answers

13

In Notepad++, this works with any kind of horizontal whitespace (spaces or tabs) and with any kind of line break (\n, \r\n, or \r).


  • Ctrl+H
  • Find what: ^(\h*)(?=def\b.*(\R))
  • Replace with: $1@jit\(parallel=True\)$2$1
  • TICK Match case
  • TICK Wrap around
  • SELECT Regular expression
  • UNTICK . matches newline
  • Replace all

Explanation:

^           # beginning of line
    (           # group 1
        \h*         # 0 or more horizontal spaces
    )           # end group
    (?=         # positive lookahead, make sure we have after:
        def\b       # literally "def" and a word boundary, in order to not match "default"
        .*          # 0 or more any character but newline
        (\R)        # group 2, any kind of linebreak
    )           # end lookahead

Replacement:

$1                      # content of group 1, the spaces to insert
@jit\(parallel=True\)   # literally
$2                      # content of group 2, the linebreak used in the file
$1                      # content of group 1, the spaces to insert
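
For anyone who would rather script the change across many files than click through an editor, the same idea can be sketched in Python. This is only an approximation under stated assumptions: Python's `re` module has no `\h` or `\R`, so `[ \t]*` and `\r?\n` stand in for them (bare `\r` line endings are not handled), and the helper name `add_decorator` is made up for this illustration.

```python
import re

# Approximation of ^(\h*)(?=def\b.*(\R)) for Python's re module:
# [ \t]* replaces \h*, and (\r?\n) replaces \R (no bare-\r support).
DEF_LINE = re.compile(r"^([ \t]*)(?=def\b[^\r\n]*(\r?\n))", re.MULTILINE)


def add_decorator(source, decorator="@jit(parallel=True)"):
    """Insert `decorator` above every `def` line, reusing its indentation."""
    def repl(match):
        indent, newline = match.group(1), match.group(2)
        # Same shape as the replacement string $1<decorator>$2$1.
        return f"{indent}{decorator}{newline}{indent}"
    return DEF_LINE.sub(repl, source)


sample = (
    "def some_func_1(foo, bar):\n"
    "    def some_func_2(foo, bar):\n"
    "        return foo\n"
    "default = 1\n"
)
print(add_decorator(sample))
```

Like the pattern above, this is purely textual: a line that starts with `def` inside a docstring would be decorated too, and a `def` on the very last line is skipped if the file has no trailing line break.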


Toto
  • Thanks for your solution, could you briefly check comments below and also my suggestion, which (as far as I can tell) works, but is much shorter? Also, you and @Yisroel Tech appear to be using a different group replacement syntax (`$1` vs. `\1`) – srhslvmn Jul 25 '22 at 16:20
  • 1
    @joanna: Your regex `(^.*)(def )` matches zero or more of any character preceding `def`, so for example `undef` or `blah blah def blah` will also match; I'm not sure you want that. Notepad++ supports both `\1` and `$1`. The former is a legacy notation that should be used only in the regex itself (where it is called a backreference); the latter should be used in the replacement part, and it allows referencing more than 9 groups. It is used in the PCRE and Boost regex flavours. – Toto Jul 25 '22 at 16:34
  • 1
    Okay, and what about the usage of `\h`? Why not `\s` or simply ` ` (space)? [Wikipedia](https://en.wikipedia.org/wiki/Regular_expression) only mentions `\h` as a notation for *"word head"*, not "horizontal space". Or is it that `*` cannot operate on ` `, but only on `\h`? – srhslvmn Jul 25 '22 at 17:24
  • Furthermore, could you explain the function of `?=`? You call it a "positive lookahead", but it's not clear what it does. Also, is it necessary to have `\b` at the end of `def\b`? Again, wouldn't also a space work, i.e. `def ` (as this is literally what comes after)? – srhslvmn Jul 25 '22 at 17:26
  • ...ah, I think that I can see from my second-last comment above how using ` ` isn't a good idea (SE comment interpreter mangles my input). It seems that whitespaces might not be recognized as characters, so one has to be more explicit and use `\s` or `\h`. But then the question remains, why `\h` and not `\s` – srhslvmn Jul 25 '22 at 17:30
  • Also, I don't understand the meaning of your last three lines in the explanation, i.e. `.* # 0 or more any character but newline` (the "but newline" part), `(\R) # group 2, any kind of linebreak` (**all** of this, `\R` isn't even listed e.g. in Wikipedia, also, shouldn't this be the third group and not second?), and finally `) # end lookahead`. – srhslvmn Jul 25 '22 at 17:36
  • @joanna: `\h` stands for horizontal whitespace (i.e. a space or a tab), `\s` stands for any whitespace, including line breaks. `(?=...)` matches but doesn't consume any characters: `(?=def\b.*(\R))` matches `def`, a word boundary, zero or more of any character, and a line break, but the cursor stays just before `def`, so the same text can be matched again in the next step. The lookahead itself is not a capture group. A space after `def` doesn't match a tab. You will find clear explanations [here](https://www.regular-expressions.info/), much better than Wikipedia's. – Toto Jul 25 '22 at 18:25
  • Ah, so lookahead means reading subsequent characters and resetting the pointer to the start position? So this whole regex thing is like a linear stencil/sieve frame that is moved over a string of characters from beginning to end? – srhslvmn Jul 25 '22 at 18:53
  • Also, why do you need to match a tabulation after `def`? As this is Python syntax, I would always expect something like `def (args, kwargs):`, so never tabs. – srhslvmn Jul 25 '22 at 18:54
  • @joanna: 1) Yes, it is. 2) It's more for the sake of a generic answer. I thought we could have a tab between `def` and the function name, couldn't we? – Toto Jul 25 '22 at 19:23
6

A better solution may be to use Numba's jit_module, which JIT-compiles all the functions in a module automatically, so the individual def lines don't need to be edited.
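
To make the idea concrete without depending on Numba, here is a minimal pure-Python sketch of "decorate every function in a module in one pass". It is not Numba code and not how Numba implements anything: `decorate_all` and `traced` are hypothetical names invented for this illustration, and `traced` merely stands in for the role a JIT decorator would play.

```python
import functools
import types

calls = []  # records the names of wrapped functions as they are called


def traced(func):
    """Stand-in decorator (hypothetical); it only logs each call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        calls.append(func.__name__)
        return func(*args, **kwargs)
    return wrapper


def decorate_all(namespace, decorator):
    """Rebind every plain function in `namespace` to its decorated version."""
    for name, obj in list(namespace.items()):
        if isinstance(obj, types.FunctionType):
            namespace[name] = decorator(obj)


# Build a tiny throwaway module so the example is self-contained.
demo = types.ModuleType("demo")
exec(
    "def inc(x):\n"
    "    return x + 1\n"
    "\n"
    "def add(x, y):\n"
    "    return inc(x) + y\n",
    demo.__dict__,
)

decorate_all(vars(demo), traced)
print(demo.add(1, 2), calls)
```

Note that a pass over the module namespace like this only reaches top-level functions; nested definitions such as some_func_2 in the question are not touched. Whether jit_module behaves the same way is worth checking in the Numba documentation.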

ihall
  • 1
    It may be a better solution to this particular problem, but long term learning a bit of Regex is a better solution. – jmoreno Jul 26 '22 at 03:47
4

You can do that using Regex capture groups and then reusing the first group (the indentation) on both lines in the replacement.

Search (with Regex):

(^.*)(def .*\([^\(]+\))

And replace with:

\1@jit\(parallel=True\)\r\n\1\2

See it in action: (animated screenshot not reproduced here)

Nelson
Yisroel Tech
  • 1
    Hi, thanks for your answer!! Could you please briefly explain what "capture groups" means? I can't really tell from your animation. – srhslvmn Jul 24 '22 at 22:50
  • Also, `(^.*)` looks like a winking smiley. xD – srhslvmn Jul 24 '22 at 22:50
  • Okay, this solution doesn't work, especially since I don't want `(foo, bar)` inserted everywhere. This part obviously needs to stay variable, the same as the function names – srhslvmn Jul 24 '22 at 22:59
  • Ah, using this as the search string works: `(^.*)def ` (with space at the end) – srhslvmn Jul 24 '22 at 23:01
  • ...and using this as the replacement string: `%jit\ndef ` (also with space at the end) – srhslvmn Jul 24 '22 at 23:01
  • The space is important as otherwise strings such as `default` will also get replaced – srhslvmn Jul 24 '22 at 23:02
  • Capture groups are just the two parenthesized parts of the search pattern. Each pair of parentheses () forms a group; when you use \1 in the replacement you insert the first group (which is only the indentation), and when you use \2 you insert the second group (the function definition). – Yisroel Tech Jul 24 '22 at 23:02
  • With regard to foo and bar, I'm not a programmer, so I didn't know their meaning and thought they were the same for all of the functions. But indeed you can modify that. – Yisroel Tech Jul 24 '22 at 23:04
  • 1
    I'm a programmer and I've made a more generic RegEx that'll capture anything in the round brackets. – Nelson Jul 25 '22 at 09:10
  • @Nelson: Why do you put an opening parenthesis in the character class? In my opinion, a closing parenthesis (or both) would be better, and there is no need to escape it inside a character class. – Toto Jul 25 '22 at 11:21
  • Haha, wow. Thanks a lot for all the suggestions. Would you mind briefly explaining what's missing in my own suggestions, which (funnily) also happens to be the shortest? I.e. the search pattern `(^.*)(def )` – srhslvmn Jul 25 '22 at 16:15
  • Your search patterns are all significantly longer. Also, it seems that @Toto and @Yisroel Tech are using a different group replacement syntax, i.e. `$1` vs. `\1`, which I find confusing – srhslvmn Jul 25 '22 at 16:17
  • @joanna multiple regex syntaxes and varying levels of capability in the engines processing them (ex limitations around what you can do with new lines) are just part of the "fun". – Dan Is Fiddling By Firelight Jul 26 '22 at 14:31
-1

In Notepad++, searching for (^.*)(def ) (with space at the end) and replacing with \1@jit\(parallel=True\)\r\n\1\2 works. The space at the end is important as otherwise strings such as default will also get replaced.

srhslvmn
  • 1
    Since `.*` matches any character, this will also match any line containing a word that ends in "def"; for example, `var mydef = 5` would be matched and replaced. – Falco Jul 26 '22 at 14:01
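
The false positives are easy to reproduce in Python. This is a small sketch, assuming that restricting the prefix to spaces and tabs is the intended fix; `[ \t]` is used because Python's `re` module lacks `\h`, and the sample lines are made up for illustration.

```python
import re

loose = re.compile(r"(^.*)(def )")        # pattern from this answer
strict = re.compile(r"^([ \t]*)(def\b)")  # only indentation allowed before def

lines = [
    "def some_func_1(foo, bar):",
    "    def some_func_2(foo, bar):",
    "mydef = 5",
    "x = undef (1)",
]

# Collect the lines each pattern would touch.
loose_hits = [line for line in lines if loose.search(line)]
strict_hits = [line for line in lines if strict.search(line)]

print(loose_hits)   # all four lines, including the two non-definitions
print(strict_hits)  # only the two real def lines
```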