How can I capitalize the first letter of the first word of every sentence with a shell script?



I am trying to capitalize the first letter of the first word of every sentence in a text file named input.txt, and I want this input file to be passed as an argument to the shell script:

./script.sh input.txt

Example input file:

i am Andrew. you are Jhon. here we are, forever.

Result file:

I am Andrew. You are Jhon. Here we are, forever.

A special case (related to @RaduRadeanu's answer): what if our text is:

i am andrew. you
are jhon. here we are
forever

The result would be:

I am andrew. You
Are jhon. Here we are
Forever.

So it capitalizes the first word of every sentence, but also the first word of every new line. How do we avoid capitalizing the first word of a new line?

So the correct result should be:

I am andrew. You
are jhon. Here we are
forever.

And what if a sentence ends with "?" or "!"?


The sed command is a powerful tool for editing files from shell scripts; with it you can edit a text file in almost any way you want. That said, the following script does what you want (it relies on GNU sed for -i, \s and \U):

#!/bin/bash

# check that exactly one file is given as argument
if [ $# -ne 1 ]; then
  echo "Usage: $(basename "$0") FILE_NAME"
  exit 1
fi

sed -i 's/^\s*./\U&\E/' "$1"         # capitalize the first letter of each paragraph/new line
sed -i 's/[.!?]\s*./\U&\E/g' "$1"    # capitalize the first letter that follows a dot, ? or !
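
To try the two substitutions without modifying a file in place, they can be wrapped in a small filter that reads standard input. This is only a sketch: the function name capitalize_sentences is made up for illustration, and it assumes GNU sed, since \s and \U are GNU extensions.

```shell
# Hypothetical helper: capitalize sentence starts on stdin, write to stdout.
# Assumes GNU sed (\s and \U are GNU extensions).
capitalize_sentences() {
  sed -e 's/^\s*./\U&/' -e 's/[.!?]\s*./\U&/g'
}

# Sample sentence from the question:
printf '%s\n' 'i am Andrew. you are Jhon. here we are, forever.' | capitalize_sentences
# prints: I am Andrew. You are Jhon. Here we are, forever.
```

Because the filter works line by line, it still capitalizes the first letter of every line, exactly like the script above.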

For your special case, things become slightly more complicated:

#!/bin/bash

# check that exactly one file is given as argument
if [ $# -ne 1 ]; then
  echo "Usage: $(basename "$0") FILE_NAME"
  exit 1
fi

sed -i '1s/^\s*./\U&\E/' "$1"       # capitalize the first letter of the file
sed -i 's/[.!?]\s*./\U&\E/g' "$1"   # capitalize the first letter that follows a dot, ? or ! on the same line

# if a line ends in a dot, ? or !, capitalize the first letter of the next line
line_no=0
while read -r line; do
  line_no=$((line_no + 1))
  lastchr=${line#"${line%?}"}       # last character of the current line
  if [ "$lastchr" = "." ] || [ "$lastchr" = "!" ] || [ "$lastchr" = "?" ]; then
    sed -i "$((line_no + 1))s/^\s*./\U&\E/" "$1"
  fi
done < "$1"

You can also consult the tutorial "Unix - Regular Expressions with SED" to see how to handle situations like these.
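
As a possible alternative to the loop, and not part of the script above, GNU sed's -z (--null-data) option treats the whole file as a single record as long as it contains no NUL bytes. In that mode ^ only matches at the very start of the file and \s also matches newlines, so a line break inside a sentence is left alone. A minimal sketch, assuming a GNU sed that supports -z; the function name capitalize_across_lines is hypothetical:

```shell
# Hypothetical helper; assumes GNU sed with the -z (--null-data) option.
# With -z the whole input is one record, so:
#   ^    matches only at the start of the input
#   \s*  can span the newline between a terminator and the next sentence
capitalize_across_lines() {
  sed -z -e 's/^\s*./\U&/' -e 's/[.!?]\s*./\U&/g' "$@"
}

# Reads the named file(s), or stdin when called without arguments:
printf 'i am andrew. you\nare jhon. here we are\nforever.\n' | capitalize_across_lines
```

Note that this reads the entire file into memory at once, which is fine for ordinary text files but may not suit very large inputs.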



How about using bash's builtin read command with the period character as delimiter to read each whole sentence into a variable, and then capitalizing the initial character of the variable? Something like:

$ cat myfile
i am andrew. you
are jhon. here we are
forever.

$ while read -rd\. sntc; do printf "%s. " "${sntc^}"; done < myfile; printf "\n"
I am andrew. You
are jhon. Here we are
forever.
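
The ${sntc^} expansion used above needs bash 4.0 or newer. On an older bash, a helper along the following lines could stand in for it. This is a sketch only: capitalize_first is a hypothetical name, and it assumes the first character is a single-byte (ASCII) letter.

```shell
# Hypothetical replacement for ${var^}, using only POSIX parameter
# expansion and tr. Assumes an ASCII first character.
capitalize_first() {
  rest=${1#?}               # everything after the first character
  first=${1%"$rest"}        # the first character on its own
  first=$(printf '%s' "$first" | tr '[:lower:]' '[:upper:]')
  printf '%s%s\n' "$first" "$rest"
}

capitalize_first 'here we are'
# prints: Here we are
```

Inside the loop, "${sntc^}" would then become "$(capitalize_first "$sntc")". The variables first and rest are global here, because plain POSIX sh has no local keyword.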

To handle multiple sentence terminators such as ? and ! in addition to the regular period, here is a different approach using awk. Note that the RT variable, which lets us recover the record terminator that actually matched after each sentence, is a GNU awk (gawk) extension and is not available in all varieties of awk:

$ cat myfile
i am andrew? you
are jhon. here we are
forever!

$ awk 'BEGIN{RS="[.!?]+[ \t\n]*"}; {sub(".", substr(toupper($0), 1,1), $0); printf ("%s%s", $0, RT)}' myfile
I am andrew? You
are jhon. Here we are
forever!

Note that the record separator regex above will handle multiple consecutive delimiters ('!?!!!') and optional trailing spaces - which the read-based version doesn't.

As a further enhancement, let's try to add rudimentary handling of quoted sentences by modifying the RS regex once more and changing the sub so that it upper-cases the first non-quote character:

awk 'BEGIN{RS="[.!?]+[\"'\'']?[ \t\n]*"}; {match($0, "[^\"'\'']"); sub("[^\"'\'']", substr(toupper($0),RSTART,1), $0); printf ("%s%s", $0, RT)}'

e.g.

$ cat myfile
i am andrew.    "are
you jhon?"  'here we are
forever!?'

$ awk 'BEGIN{RS="[.!?]+[\"'\'']?[ \t\n]*"}; {match($0, "[^\"'\'']"); sub("[^\"'\'']", substr(toupper($0),RSTART,1), $0); printf ("%s%s", $0, RT)}' myfile
I am andrew.    "Are
you jhon?"  'Here we are
forever!?'
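
The commands above rely on RT and therefore on gawk. Where only another awk is available, one possible workaround is to walk through each line character by character and keep a flag that survives across lines. The sketch below is not equivalent to the gawk version in every detail, and the function name capitalize_sentences_awk is made up for illustration.

```shell
# Hypothetical portable variant: no RT, no regex RS.
# The cap flag persists across lines, so a line break inside a
# sentence does not trigger capitalization.
capitalize_sentences_awk() {
  awk '
    BEGIN { cap = 1 }                  # the start of the input begins a sentence
    {
      out = ""
      n = length($0)
      for (i = 1; i <= n; i++) {
        c = substr($0, i, 1)
        if (c ~ /[.!?]/) {
          cap = 1                      # terminator seen: capitalize what follows
        } else if (cap && c ~ /[A-Za-z0-9]/) {
          c = toupper(c)               # first letter or digit after a terminator
          cap = 0
        }                              # quotes and whitespace leave the flag set
        out = out c
      }
      print out
    }
  ' "$@"
}

capitalize_sentences_awk myfile
```

Like the gawk versions, this treats every period as a sentence end, so abbreviations such as "e.g." will be capitalized incorrectly.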