Regex issue please help

General discussion related to "Everything".
Debugger
Posts: 630
Joined: Thu Jan 26, 2017 11:56 am

Regex issue please help

Post by Debugger »

How do I add a page address to all numbers?
-123456_12345
-19,876
12345_98
123456

Replace with:
site.com/-123456_12345
site.com/-19876
site.com/12345_098
site.com/123456

I have a huge list and I do not know how to do it.
This is very important for me!!!
NotNull
Posts: 5461
Joined: Wed May 24, 2017 9:22 pm

Re: Regex issue please help

Post by NotNull »

Is your input like

INPUT1.txt

Code: Select all

-123456_12345
-19,876
12345_98
123456
or like:

INPUT2.txt

Code: Select all

bla-123456_12345foo
bla-19,876foo
bla12345_98foo
bla123456foo
?


If it is like INPUT1.txt and just a few hundred or a few thousand lines, enter this at a CMD prompt:

Code: Select all

(for /f "delims=" %x in (INPUT1.txt) do @echo site.com/%x) > OUTPUT.txt
If it is like INPUT2.txt and/or 10000+ lines, execute this command:

Code: Select all

SSED.exe s/\([-0-9_,][-0-9_,]*\)/site.com\/\1/ INPUT1.txt > OUTPUT.txt
I have had SSED on my systems for years and can't find where I downloaded it. If you have trouble finding it, let me know.
But I think any other SED utility will do (this is a basic SED command).

EDIT: Found it: http://sed.sourceforge.net/grabbag/ssed/
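
For readers without a SED build at hand, the two cases above can also be sketched in Python. This is only an illustration, not part of the original commands: the names `prefix_lines` and `prefix_first_number` are made up for this example, and the pattern deliberately requires at least one character so that it cannot match the empty string at the start of a line.

```python
import re

PREFIX = "site.com/"

# One or more of: minus sign, digit, underscore, comma.
NUMBER_RUN = re.compile(r"[-0-9_,]+")


def prefix_lines(lines):
    """INPUT1-style data: put the prefix in front of every line."""
    return [PREFIX + line for line in lines]


def prefix_first_number(lines):
    """INPUT2-style data: put the prefix in front of the first number run of each line."""
    return [
        NUMBER_RUN.sub(lambda m: PREFIX + m.group(0), line, count=1)
        for line in lines
    ]


if __name__ == "__main__":
    print("\n".join(prefix_lines(["-123456_12345", "-19,876", "12345_98", "123456"])))
    print("\n".join(prefix_first_number(["bla-123456_12345foo", "bla123456foo"])))
```

Reading the real file line by line and writing the result to OUTPUT.txt is left out here to keep the sketch short.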
Debugger
Posts: 630
Joined: Thu Jan 26, 2017 11:56 am

Re: Regex issue please help

Post by Debugger »

Where do I put the input text file (what kind of path)?
I want to put any address at the beginning of each line:

https://website/album-XXX_XXX
https://website/albums-XXX
https://website/wall-XXX
https://website/albumsXXX

XXX <== ALWAYS A NUMBER

The name is always one of:
albums
albums-
album
wall-
NotNull
Posts: 5461
Joined: Wed May 24, 2017 9:22 pm

Re: Regex issue please help

Post by NotNull »

Debugger wrote:Where to put a text file - input (what kind of path)?
- Just put the file with the numbers in a folder, let's say C:\MyTest
- Copy SSED.exe also to this folder
- Start CMD.exe
- Execute this command to go to the MyTest folder: CD /D C:\MyTest
- Type one of the commands mentioned in the previous post.

When you see the prompt again, the command has finished.
Your output is in OUTPUT.txt (in that same folder).
Debugger
Posts: 630
Joined: Thu Jan 26, 2017 11:56 am

Re: Regex issue please help

Post by Debugger »

It does not work for me:


C:\Windows\system32>SSED.exe s/\([-0-9_]*\)/site.com\/\1/ CD /D E:\My Test.txt > OUTPUT.txt
SSED.exe: can't read CD: No such file or directory
SSED.exe: can't read /D: No such file or directory
SSED.exe: can't read E:\My: No such file or directory
SSED.exe: can't read Test.txt: No such file or directory



What am I doing wrong? The path exists.

E:\My test\my test.txt

example PATH
Q:/Rysunki ALL!!/http://site.com/album-47_22995651,,,21,7 KB,,,05-05-14 05:34:26,,

Example add:
http://site.com/album
Stamimail
Posts: 1122
Joined: Sat Aug 31, 2013 9:05 pm

Re: Regex issue please help

Post by Stamimail »

Another approach:

Use MS-Word Search and Replace.
¶ is a paragraph sign.
Enable the display of hidden characters (Ctrl+Shift+8) and you will see:
-123456_12345¶
-19,876¶
12345_98¶
123456¶

¶ = ^p in the Search&Replace dialog box.

So, in your case you need to search for ¶:
^p
and Replace with:
^psite.com/

No regex skills needed.
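
To see what this kind of replacement does to a list, here is a rough Python sketch. It assumes that a paragraph mark behaves like a plain newline, which is a simplification of how Word stores text, and `add_prefix_after_breaks` is a hypothetical helper name. In this plain-text model the prefix lands after each line break, so the first line gets none and the final break is followed by a stray prefix; both ends of the list may need a manual touch-up.

```python
def add_prefix_after_breaks(text, prefix="site.com/"):
    """Mimic replacing every line break with 'line break + prefix'."""
    return text.replace("\n", "\n" + prefix)


if __name__ == "__main__":
    sample = "-123456_12345\n-19,876\n12345_98\n123456\n"
    print(add_prefix_after_breaks(sample))
```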

:!: I don't think this is possible with Notepad.
Maybe with Notepad++?
NotNull
Posts: 5461
Joined: Wed May 24, 2017 9:22 pm

Re: Regex issue please help

Post by NotNull »

Debugger wrote: What am I doing wrong?
I have no idea how you got from these instructions:
- Just put the file with the numbers in a folder, let's say C:\MyTest
- Copy SSED.exe also to this folder
- Start CMD.exe
- execute this command to go to the MyTest folder: CD /D C:\MyTest
- type one of the commands mentioned in the previous post.

When you see the prompt again, the command has finished
Your output is in OUTPUT.txt (in that same folder)
(Where command = SSED.exe s/\([-0-9_,]*\)/site.com\/\1/ INPUT1.txt > OUTPUT.txt )

To this:
Debugger wrote: C:\Windows\system32>SSED.exe s/\([-0-9_]*\)/site.com\/\1/ CD /D E:\My Test.txt > OUTPUT.txt
Also, you need to enclose file and folder names that contain spaces in double quotes, like "E:\My Path\My file.txt"
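
As an illustration of that quoting rule, the sketch below builds a Windows command line from separate arguments in Python. It relies on `list2cmdline`, a helper in Python's standard `subprocess` module that follows the Microsoft C runtime quoting rules. The argument list itself is only an example made up from the paths in this thread, and the `> OUTPUT.txt` redirection is left out because it belongs to the command prompt, not to the program's arguments.

```python
import subprocess

# Example arguments: program, SED script, input file (path contains spaces).
args = ["SSED.exe", r"s/\([-0-9_]*\)/site.com\/\1/", r"E:\My test\my test.txt"]

# Arguments that contain spaces are wrapped in double quotes;
# the others are passed through unchanged.
command_line = subprocess.list2cmdline(args)

if __name__ == "__main__":
    print(command_line)
```

Only the path ends up in double quotes, which is what keeps it from being read as two separate file names.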
therube
Posts: 4977
Joined: Thu Sep 03, 2009 6:48 pm

Re: Regex issue please help

Post by therube »

(I'm really confused about what the source files look like and just what we're trying to end up with.)
NotNull
Posts: 5461
Joined: Wed May 24, 2017 9:22 pm

Re: Regex issue please help

Post by NotNull »

therube wrote:(I really confused about what the source files look like & just what we're trying to end up with?)
That's how these threads go in general: the original question is about something relatively straightforward, only to end up with a question that is only remotely related to the original one.
We'll get to that point eventually. Patience, my friend, patience ...

To speed things up :D :
@Debugger, can you post 10 lines of your original input file? And what should the exact output be for those lines?
Stamimail
Posts: 1122
Joined: Sat Aug 31, 2013 9:05 pm

Re: Regex issue please help

Post by Stamimail »

That's how these threads go in general: the original question is about something relatively straightforward, only to end up with a question that is only remotely related to the original one.
We'll get to that point eventually. Patience, my friend, patience ...
I think it's natural, like in the real world.
But you can try next time to make this
To speed things up :D :
@Debugger, can you post 10 lines of your original input file? And what should the exact output be for those lines?
the 2nd post.
NotNull
Posts: 5461
Joined: Wed May 24, 2017 9:22 pm

Re: Regex issue please help

Post by NotNull »

Stamimail wrote:
That's how these threads go in general: the original question is about something relatively straightforward, only to end up with a question that is only remotely related to the original one.
We'll get to that point eventually. Patience, my friend, patience ...
I think it's natural, like in the real world.
But you can try next time to make this
To speed things up :D :
@Debugger, can you post 10 lines of your original input file? And what should the exact output be for those lines?
the 2nd post.
I do not have a problem with how things 'flow'. On the contrary: after the first step/suggestion, other people can jump in and take that to the next level (or suggest a different approach, like you did).
That is what a forum is about (IMO): sharing solutions/knowledge that also help other people besides the one who asked the question.

The downside is that you have to rethink and rewrite multiple times.


I'll try your approach next time. Maybe it *does* work better.
Debugger
Posts: 630
Joined: Thu Jan 26, 2017 11:56 am

Re: Regex issue please help

Post by Debugger »

Y:/ZK 1 Drawings2 /-40010106_187898190,,,72 KB,,,26-03-15 23:58:44,,
Y:/ZK Drawings/-40010106_187898190,,,170 KB,,,26-03-15 23:58:42,,
Z:/ZK 1/-40010106_187898190,,,168 KB,,,26-03-15 23:58:43,,


Replace with:
Y:/ZK 1 Drawings2/http://site.com/album-40010106_187898190,,,72 KB,,,26-03-15 23:58:44,,
Y:/ZK Drawings/http://site.com/album-40010106_187898190,,,170 KB,,,26-03-15 23:58:42,,
Z:/ZK 1/http://site.com/album10106_187898190,,,168 KB,,,26-03-15 23:58:43,,

Drive Letter + Folder + HTTP + other name

U:\PATH.csv
Debugger
Posts: 630
Joined: Thu Jan 26, 2017 11:56 am

Re: Regex issue please help

Post by Debugger »

The easiest way to do the replacement would be a regular expression in a powerful text editor; for a million paths that would also be the fastest.

Y:/ANYFOLDERNAME/SUBFOLDER NAME,,,210 KB,,,26-03-15 23:58:41,,

Replace with:
Y:/ANYFOLDERNAME/https://site.com/album- OR albumSUBFOLDERNAME,,,210 KB,,,26-03-15 23:58:41,,

So how do I add a URL between the folder name and the subfolder name?
NotNull
Posts: 5461
Joined: Wed May 24, 2017 9:22 pm

Re: Regex issue please help

Post by NotNull »

Debugger wrote:Y:/ZK 1 Drawings2 /-40010106_187898190,,,72 KB,,,26-03-15 23:58:44,,
Y:/ZK Drawings/-40010106_187898190,,,170 KB,,,26-03-15 23:58:42,,
Z:/ZK 1/-40010106_187898190,,,168 KB,,,26-03-15 23:58:43,,


Replace with:
Y:/ZK 1 Drawings2/http://site.com/album-40010106_187898190,,,72 KB,,,26-03-15 23:58:44,,
Y:/ZK Drawings/http://site.com/album-40010106_187898190,,,170 KB,,,26-03-15 23:58:42,,
Z:/ZK 1/http://site.com/album10106_187898190,,,168 KB,,,26-03-15 23:58:43,,
First up, some remarks:
Regex is all about patterns. So the more examples, the better.
And it's also about precision: Y:/ZK 1 Drawings2 /-40010106_1.... A folder name ending in a space? And that space is gone in the converted list? :?
But if I have to make do with those 3 lines, things will probably go wrong, as an example in your opening post included a "," and that is used here as a separator/delimiter.


Anyhow ... Based on your 3 examples:
  • Put SSED.exe in some folder
  • Save the script as AddURL.cmd in that same folder
  • Drag your input file onto the AddURL.cmd script
  • If your input file was Q:\path\filename.ext, the output will be in Q:\path\filename_out.ext
  • Done
AddURL.cmd

Code: Select all

@"%~dp0SSED.exe" s/\/\([-0-9_]*,\)/\/http:\/\/site.com\/album\1/ "%~1" > "%~dpn1_out%~x1"
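
For anyone who wants to check the pattern without SED, here is a minimal Python sketch of the same substitution. It is an equivalent written for illustration, not the script itself: the function name `add_url` and its `url` parameter are made up for this example, and only the first match per line is replaced, like a SED `s` command without the `g` flag.

```python
import re

# A slash followed by a run of '-', digits and '_' that ends in a comma,
# mirroring the SED pattern \/\([-0-9_]*,\).
PATTERN = re.compile(r"/([-0-9_]*,)")


def add_url(line, url="http://site.com/album"):
    """Insert url between the slash and the number run (first match only)."""
    return PATTERN.sub(lambda m: "/" + url + m.group(1), line, count=1)


if __name__ == "__main__":
    sample = "Y:/ZK Drawings/-40010106_187898190,,,170 KB,,,26-03-15 23:58:42,,"
    print(add_url(sample))
```

The first slash in each path (after the drive letter) is skipped because it is followed by letters, so only the slash in front of the number run is affected.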

Debugger wrote:The easiest way to replace would be a regular expression, in a powerful word editor, the performance for a million paths would be the fastest.
How do you know that?
Debugger
Posts: 630
Joined: Thu Jan 26, 2017 11:56 am

Re: Regex issue please help

Post by Debugger »

It does not end with a space.
It still does not work; the output file is empty.
Just add a website URL to each subfolder's name. It cannot be explained any better.
NotNull
Posts: 5461
Joined: Wed May 24, 2017 9:22 pm

Re: Regex issue please help

Post by NotNull »

Debugger wrote:It does not end with a space.
There is a space between Drawings2 and /. Don't you see it?

It still does not work, the output file is empty.
As you managed before to completely screw up some pretty straightforward instructions, I can only assume you did the same with the current (even simpler) steps.

Let me repeat them:
  • Put SSED.exe in some folder
  • Save the script as AddURL.cmd in that same folder
  • Drag your input file onto the AddURL.cmd script
  • If your input file was Q:\path\filename.ext, the output will be in Q:\path\filename_out.ext
  • Done
Let me rephrase:
  • Put SSED.exe in some folder
  • Save the script as AddURL.cmd in that same folder
To sum up:
SSED.exe and AddURL.cmd have to be in the same folder.
(Don't assume it will work just because SSED.exe is somewhere in your PATH ...)
Debugger
Posts: 630
Joined: Thu Jan 26, 2017 11:56 am

Re: Regex issue please help

Post by Debugger »

I think I'm doing what you say, but it does not work for me. No positive results.
With regex it would be a thousand times faster, but I do not remember the regex that I got a few years ago from a friend.

https://s15.postimg.cc/c2fjka8nv/Screen ... .58_AM.jpg


There are no spaces at the end; that was a bad copy/paste.

Edit:

In addition, these variations must also be taken into account:

album-(\d+)_(\d+)
album(\d+)_(\d+)
tag(\d+)
wall-(\d+)
NotNull
Posts: 5461
Joined: Wed May 24, 2017 9:22 pm

Re: Regex issue please help

Post by NotNull »

Debugger wrote: With regex it would be a thousand times faster, but I do not remember the regex that I got a few years ago from a friend.
This *IS* all regex ... SED is a stream editor that uses regex patterns to search for text and replace it with something else.
It is one of the fastest ways to do what you want (if not THE fastest).

I think I'm doing what you say, but it does not work for me. No positive results.
Alright then, let's try it another way:
  • Extract the attached AddURL.zip to some empty folder
  • In File Explorer, Drag&drop input1.txt to AddURL.cmd

    BTW: content of input1.txt:

    Code: Select all

    Y:/ZK 1 Drawings2/-40010106_187898190,,,72 KB,,,26-03-15 23:58:44,,
    Y:/ZK Drawings/-40010106_187898190,,,170 KB,,,26-03-15 23:58:42,,
    Z:/ZK 1/-40010106_187898190,,,168 KB,,,26-03-15 23:58:43,,
    
    A CMD window will open.
  • Please post the content of that window
  • Press the spacebar to close that CMD window
  • Is there an input1_out.txt in the folder?
  • What are the contents of that file?
In addition, the change must still be taken into account:

album-(\d+)_(\d+)
album(\d+)_(\d+)
tag(\d+)
wall-(\d+)
I'll ignore those for now as they were not in your original examples.
Please create examples in such a way that if those get converted the right way, all others will go well too.
(again: exact input; exact output)


In the meantime I have built a better regex "query". But whether that can be used depends on your definitive list of examples.
Attachments
AddURL.zip
(42.94 KiB) Downloaded 289 times
Debugger
Posts: 630
Joined: Thu Jan 26, 2017 11:56 am

Re: Regex issue please help

Post by Debugger »

Your executable file does not work; it shows 1 KB of data (empty).
NotNull
Posts: 5461
Joined: Wed May 24, 2017 9:22 pm

Re: Regex issue please help

Post by NotNull »

Debugger wrote:Your executable file does not work, it shows 1KB of data (empty)
As a test, I downloaded the zip myself: SSED.exe is OK.

Sometimes virus scanners quarantine a suspicious file and replace it with a placeholder text file containing some explanation (what happens when you open the 1 KB SSED.exe in Notepad?).

As an alternative, you can replace the 1 KB exe with the one you already downloaded and run the tests that way.
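
A quick way to tell a real executable from a placeholder text file, sketched in Python: Windows executables start with the two bytes `MZ`, so a file without them is not a program. The helper names `looks_like_windows_exe` and `describe` are made up for this example, and passing the check does not prove the file is intact; it only rules out the placeholder case.

```python
from pathlib import Path


def looks_like_windows_exe(path):
    """Return True if the file starts with the 'MZ' signature of Windows executables."""
    with open(path, "rb") as handle:
        return handle.read(2) == b"MZ"


def describe(path):
    """Report the file size and whether the MZ signature is present."""
    size = Path(path).stat().st_size
    if looks_like_windows_exe(path):
        kind = "executable header found"
    else:
        kind = "no executable header"
    return f"{path}: {size} bytes, {kind}"


if __name__ == "__main__":
    import sys

    # Pass the file to check as the first argument, e.g. the extracted SSED.exe.
    if len(sys.argv) > 1:
        print(describe(sys.argv[1]))
```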
Debugger
Posts: 630
Joined: Thu Jan 26, 2017 11:56 am

Re: Regex issue please help

Post by Debugger »

I have tested it many times and it does not work. I need a log file that shows exactly what happens. How do I create a log file?
Debugger
Posts: 630
Joined: Thu Jan 26, 2017 11:56 am

Re: Regex issue please help

Post by Debugger »

It shows only this and nothing more; the output is empty.


Image
Debugger
Posts: 630
Joined: Thu Jan 26, 2017 11:56 am

Re: Regex issue please help

Post by Debugger »

I think a regex would be better; in a text editor it speeds up processing millions of lines.

Drive Letter:\FOLDER\SUBFOLDER\.....
Replace with:
Drive Letter:\FOLDERhttp://site.com/.....


But it does not work for me. All my regexes are wrong!!!

[E-Z]:\\[A-Za-z0-9_-Unicode Russian]\(\d+_\d+) and OTHER
Replace with:
[E-Z]:\\[A-Za-z0-9_-Unicode Russian]\(\d+_\d+) AND OTHER$2

Very Complex Regex!!!

You could replace every "\" character with site.com/XXX and so on, but that would cause even more damage to the text.
Debugger
Posts: 630
Joined: Thu Jan 26, 2017 11:56 am

Re: Regex issue please help

Post by Debugger »

How do I remove characters from each line of a text file, leaving only the drive letter and the names of the folder and subfolder, or only the folder itself?