RE : How to capture n words between 2 strings in Python?
I have two strings and want to find every place in either text where a run of 4 or more consecutive words is the same in both, and then colour those runs in the output.
Example input:
text1 = 'hello the weather is great. hello goodbye hi hiya. happy friday today is great. I like reading books.'
text2 = 'testing this. hello goodbye hi hiya. happy sunday today is great. I like reading books.'
Desired output (shown in bold here, as I would also like the original string to be displayed):
text1 = 'hello the weather is great. **hello goodbye hi hiya.** happy friday today is great. **I like reading books.**'
text2 = 'testing this. **hello goodbye hi hiya.** happy sunday today is great. **I like reading books.**'
Please could someone share how this can be done?
Thank you
You may want to look into string-matching algorithms. To find matching words between the two strings, you can treat a part of text1 as the key and search for it in text2.
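As a rough sketch of the "key" idea, assuming words are simply whitespace-separated tokens (so punctuation stays attached to its word), you could take every 4-word window of text1 as a key and look it up among the 4-word windows of text2. The function name `shared_ngrams` and the parameter `n` are made up for this illustration.

```python
def shared_ngrams(text1, text2, n=4):
    """Return (start_index, phrase) for each n-word window of text1 found in text2."""
    words1 = text1.split()
    words2 = text2.split()
    # Every run of n consecutive words in text2, stored in a set for fast lookup.
    keys2 = {tuple(words2[i:i + n]) for i in range(len(words2) - n + 1)}
    # Slide the same window over text1 and keep the windows that also occur in text2.
    return [
        (i, " ".join(words1[i:i + n]))
        for i in range(len(words1) - n + 1)
        if tuple(words1[i:i + n]) in keys2
    ]


text1 = 'hello the weather is great. hello goodbye hi hiya. happy friday today is great. I like reading books.'
text2 = 'testing this. hello goodbye hi hiya. happy sunday today is great. I like reading books.'

for start, phrase in shared_ngrams(text1, text2):
    print(start, phrase)
```

Overlapping windows are reported separately here, so a longer shared run shows up as several consecutive hits that you would still need to merge before colouring.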
I suggest avoiding a naive brute-force search and implementing the KMP (Knuth-Morris-Pratt) search algorithm instead. KMP runs in linear time, O(n + m), in the worst case, whereas the naive search (and Boyer-Moore-Horspool in its worst case) can take O(n * m).
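Here is a minimal sketch of KMP applied to lists of words rather than characters, assuming whitespace tokenisation. The helper names `build_failure` and `kmp_find_all` are hypothetical, and the key is just one 4-word phrase chosen by hand.

```python
def build_failure(pattern):
    """fail[i] = length of the longest proper prefix of pattern[:i+1] that is also its suffix."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_find_all(haystack, pattern):
    """Return every start index at which the token list `pattern` occurs in `haystack`."""
    if not pattern:
        return []
    fail = build_failure(pattern)
    hits = []
    k = 0  # number of pattern tokens matched so far
    for i, token in enumerate(haystack):
        while k > 0 and token != pattern[k]:
            k = fail[k - 1]  # fall back without re-reading the haystack
        if token == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = fail[k - 1]  # keep going so overlapping matches are found too
    return hits


words2 = 'testing this. hello goodbye hi hiya. happy sunday today is great.'.split()
key = 'hello goodbye hi hiya.'.split()
print(kmp_find_all(words2, key))
```

KMP answers "where does this one key occur?", so you would still have to run it for each candidate key taken from text1.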
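Alternatively, you can treat text1 and text2 as whole word sequences, shift one of them by one word at a time, and check at each shift whether 4 or more consecutive words line up. If they do, record their indices. Repeat until you find a match or reach the end of the string (or go through every shift if you want all matches).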
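The shifting idea could look roughly like the sketch below. It assumes whitespace-separated words and wraps matched runs in `**` as a stand-in for colouring; you could swap in ANSI colour codes or HTML tags instead. The names `highlight` and `mark_common_runs` are invented for this example.

```python
def highlight(words, keep):
    """Join words back together, wrapping each maximal run of kept words in ** markers."""
    out = []
    i = 0
    while i < len(words):
        if keep[i]:
            j = i
            while j < len(words) and keep[j]:
                j += 1
            out.append("**" + " ".join(words[i:j]) + "**")
            i = j
        else:
            out.append(words[i])
            i += 1
    return " ".join(out)


def mark_common_runs(text1, text2, min_words=4):
    """Highlight every run of at least min_words consecutive words shared by both texts."""
    words1, words2 = text1.split(), text2.split()
    keep1 = [False] * len(words1)
    keep2 = [False] * len(words2)
    # At a given shift, words1[i] is lined up against words2[i - shift].
    for shift in range(-(len(words2) - 1), len(words1)):
        start = max(0, shift)
        stop = min(len(words1), len(words2) + shift)
        run = 0
        for i in range(start, stop + 1):
            if i < stop and words1[i] == words2[i - shift]:
                run += 1
                continue
            # The run (if any) ended just before position i.
            if run >= min_words:
                for k in range(i - run, i):
                    keep1[k] = True
                    keep2[k - shift] = True
            run = 0
    return highlight(words1, keep1), highlight(words2, keep2)


text1 = 'hello the weather is great. hello goodbye hi hiya. happy friday today is great. I like reading books.'
text2 = 'testing this. hello goodbye hi hiya. happy sunday today is great. I like reading books.'
for line in mark_common_runs(text1, text2):
    print(line)
```

Note that this compares plain word sequences, so a run can cross a sentence boundary: on the example texts it marks "hello goodbye hi hiya. happy" and "today is great. I like reading books.", which is longer than the spans in the question. If matches should stay within a sentence, you would need to split into sentences first and compare those.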