awk or sed command to replace line break plus text containing spaces

Question

An answer to another question suggests sed -i 's/original/replacement/g' file.txt to replace specific words in a text file. My starting situation looks like this:

        Item: PRF
        Type: File
        Item: AOX
        Type: Folder
        Item: DD4
        Type: File

My ending situation should look like this:

        Item: PRF^Type: File
        Item: AOX^Type: Folder
        Item: DD4^Type: File

Notes: (1) The Ask Ubuntu interface seems to suppress some of the leading spaces before Item: and Type:. There are in fact eight leading spaces. (2) I may have erred in using simplistic examples of Item. The items are actually partial Windows paths (lacking e.g., D:), some of which are quite long. A more accurate example would be Item: Folder\Some Folder\A file name.txt.

I've tried this, with and without double quotes:

sed -i 's/\n"        Type: "/\^"Type: "/g' file.txt

That gives me no errors, but also no changes. I also tried this:

awk '/ "        Item: " / { printf "%s", $0"^" } / "        Type: " / { gsub(/^[ \t]+/,"",$0); print $0 }' source.txt

I tried that to verify that I would be changing only those entries with eight blank spaces before "Item." That didn't work. Trying it with no spaces and no double quotes, as in the answer (below), also failed. Trying it with gawk -i inplace produced source.txt containing zero bytes.

My title initially specified sed. An answer proposing awk alerted me to that alternative, which (now that I'm looking at it) seems more capable. But I cannot figure out how to make it work.

steeldriver · Answer 1 · 2023-01-17T00:21:20.910

By default, sed only loads one line at a time into its pattern space. You can use the N command to load another line.
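A minimal sketch of that difference, using a throwaway four-line sample (the temporary file and the variable names are only for illustration). Without N, the pattern space never contains a newline, so a substitution on \n has nothing to match:

```shell
# Throwaway sample; the file name comes from mktemp and is only for illustration.
sample=$(mktemp)
printf '%s\n' 'a' 'b' 'c' 'd' > "$sample"

# One line per cycle: the pattern space never holds "\n", so nothing changes.
unchanged=$(sed 's/\n/^/' "$sample")

# N appends the next input line, so the embedded newline can be replaced.
joined=$(sed 'N;s/\n/^/' "$sample")

printf '%s\n' "$unchanged" '---' "$joined"
rm -f "$sample"
```

The first command should echo the input unchanged, while the second should join the lines in pairs.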

In fact, your question is a variant of a well-known "one-liner" for joining lines based on the initial character(s) of the following line¹:

40. Append a line to the previous if it starts with an equal sign "=".
sed -e :a -e '$!N;s/\n=/ /;ta' -e 'P;D'

So given

$ cat file.txt
    Item: PRF
    Type: File
    Item: AOX
    Type: Folder
    Item: DD4
    Type: File

(which has 4 initial spaces), then

$ sed -E -e :a -e '$!N;s/\n {4}Type: (File|Folder)/^Type: \1/; ta' -e 'P;D' file.txt
    Item: PRF^Type: File
    Item: AOX^Type: Folder
    Item: DD4^Type: File

Add -i or -i.bak to edit the file in place once you are happy that it is doing the right thing.
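For the layout described in the question (eight leading spaces, and Item values that are paths containing spaces), the same idea might look like the sketch below. It assumes every Type: line directly follows its Item: line, and it drops the (File|Folder) group so that any Type value is accepted. The sample file is made up for illustration:

```shell
# Made-up sample with eight leading spaces and a path containing spaces.
sample=$(mktemp)
printf '%s\n' \
  '        Item: Folder\Some Folder\A file name.txt' \
  '        Type: File' \
  '        Item: AOX' \
  '        Type: Folder' > "$sample"

# Same loop as above: join a line to the previous one when it starts with
# exactly eight spaces followed by "Type: ".
joined=$(sed -E -e :a -e '$!N;s/\n {8}Type: /^Type: /;ta' -e 'P;D' "$sample")

printf '%s\n' "$joined"
rm -f "$sample"
```

Only the newline and the spaces before Type: are replaced, so the leading spaces on the Item: lines and the spaces inside the path are left alone.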

Alternatively, you could use the following non-streaming ed editor script to match the Type: lines, substitute ^ for the leading spaces, then join to the preceding line, writing the result back to the same file:

g/^ \{4\}Type:/s//^Type:/\
-1,.j
wq

You can implement that as a non-interactive shell one-liner:

printf '%s\n' 'g/^ \{4\}Type:/s//^Type:/\' '-1,.j' 'wq' | ed -s file.txt

See The GNU ed line editor for details.

¹ See, for example, Sed One-Liners Explained, Part I: File Spacing, Numbering and Text Conversion and Substitution.

Raffa · Accepted Answer · 2023-01-16T14:03:43.470

I would use awk. It is a straightforward one-liner, like so:

awk '/Item:/ { printf "%s", $0"^" } /Type:/ { gsub(/^[ \t]+/,"",$0); print $0 }' file

That is: if the line has Item: in it, print it followed by the ^ character and no newline (printf doesn't append a newline by default); if the line has Type: in it, remove all leading whitespace and print it followed by a newline (print appends a newline by default).

The above command will not modify the original file; it writes the modified text to the terminal instead.

To edit the original file in place, use the -i inplace option of GNU awk (it might be the default awk on Ubuntu; check with awk -W version). If it is not, install gawk and use it like so:

gawk -i inplace '/Item:/ { printf "%s", $0"^" } /Type:/ { gsub(/^[ \t]+/,"",$0); print $0 }' file
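If you want to match only lines with exactly the eight leading spaces from the question, a hedged variant is sketched below. Inside /.../ the double quotes and the spaces around them are literal characters, which is probably why the quoted attempt in the question matched nothing; and since gawk -i inplace writes only what the program prints, a program that matches nothing would leave an empty file. The catch-all rule at the end is an addition of mine that passes any other line through unchanged. The sample file is made up for illustration:

```shell
# Made-up sample with eight leading spaces and a path containing spaces.
sample=$(mktemp)
printf '%s\n' \
  '        Item: Folder\Some Folder\A file name.txt' \
  '        Type: File' \
  '        Item: AOX' \
  '        Type: Folder' > "$sample"

# Anchor on the start of the line and spell out the eight spaces literally
# (no quotes inside the regex). "next" stops further rules for that line.
result=$(awk '
  /^        Item: / { printf "%s^", $0; next }             # keep line, add ^, no newline
  /^        Type: / { sub(/^[ \t]+/, ""); print; next }    # strip leading whitespace
  { print }                                                # pass other lines through
' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

The eight spaces are written out rather than expressed as {8} because not every awk implementation supports interval expressions.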

score 0 · Answer 3 · answered May 23 '24 at 10:29

It is easier to use the paste command for this:

$ cat file.txt 
Item: PRF
Type: File
Item: AOX
Type: Folder
Item: DD4
Type: File
$ paste -d'^' - - <file.txt 
Item: PRF^Type: File
Item: AOX^Type: Folder
Item: DD4^Type: File
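Note that paste joins lines as they are, so with the indented input from the question the spaces before Type: would be kept. A possible sketch, assuming the file strictly alternates Item: and Type: lines, strips them with sed first (the sample file is made up for illustration):

```shell
# Made-up sample with eight leading spaces, as described in the question.
sample=$(mktemp)
printf '%s\n' \
  '        Item: PRF' \
  '        Type: File' \
  '        Item: AOX' \
  '        Type: Folder' > "$sample"

# Strip the leading spaces from the Type: lines, then join lines in pairs.
pasted=$(sed 's/^ *Type: /Type: /' "$sample" | paste -d'^' - -)

printf '%s\n' "$pasted"
rm -f "$sample"
```

If an Item: line is ever missing its Type: line, the pairing shifts for every line after it, so this approach is only safe when the alternation is guaranteed.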

userene