Single Regular Expression to identify words with alphabetised characters

Ed Allen eallen at kc.rr.com
Tue Nov 5 06:18:53 CST 2002


Forwarded message:
>Subject: Re: Single Regular Expression to identify words with alphabetised characters
>Date: Mon, 4 Nov 2002 16:34:05 -0600
>
>Why not just do the following:
>
>sort -c filename

    If you reread the subject line slowly and think about why I chose
    those particular words, you may begin to see the difference: 'sort'
    alphabetises the words in a list, while the 'grep' prints only
    those lines whose characters are in alphabetical order.

    Not every word will be printed by the 'grep'.  For example 'dandy',
    a perfectly fine and valid word, will not be matched because its
    'a' follows a 'd', and the expression requires every 'a' to come
    before any 'd'.

    Unlike 'sort', the 'grep' command does not care that 'oops' comes
    before 'most' in the list; both words will match.
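
    To make the contrast concrete, here is a minimal sketch.  The
    pattern is an assumed form of the expression under discussion (it
    is not quoted in this message), and the four-word list and the
    function name 'alphabetised' are made up for illustration.

```shell
# Hypothetical sample list; any one-word-per-line file would do.
words='most
oops
dandy
abbey'

# Assumed pattern: each letter a..z may repeat, but letters may only
# appear in alphabetical order.  -x anchors the pattern to the whole line.
alphabetised() {
    grep -x 'a*b*c*d*e*f*g*h*i*j*k*l*m*n*o*p*q*r*s*t*u*v*w*x*y*z*'
}

# Prints the words whose own characters are in order: most, oops, abbey.
printf '%s\n' "$words" | alphabetised

# 'sort -c' asks a different question: are the lines themselves in order?
# For this list it reports "Not Sorted", because 'dandy' follows 'oops'.
printf '%s\n' "$words" | sort -c 2>/dev/null && echo "Already Sorted" || echo "Not Sorted"
```

    Note that this sketch assumes lowercase input; an empty line would
    also match, since every letter in the pattern is optional.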

>
>If sort returns nothing, the words are sorted.  If the words are not sorted, 
>sort will return something along the lines of:
>
>sort: filename:15: disorder: abbs
>
>You can also do the following if you just want to know if the words are 
>sorted.
>
>sort -c filename 2>/dev/null && echo Already Sorted || echo Not Sorted
>
>I don't know the specifics of why grep is being used or the problem, but 
>damn that is overkill just to find out if a list of words is sorted.
>
>'man sort' will give you all the details for using sort.
>
    Your discussion of 'sort' ignores the pattern-describing ability of
    Regular Expressions, which is what I was trying to highlight.

    Linux comes with a word list containing some 45,000 entries.
    Suppose you show us how to use 'sort' to find all the ones with
    the vowels in ascending order?
    
    Or you can show a "better" version of this...

        grep
        '^[^aeiouy]*[aeiouy]*[^aeiouy]*[aeiouy]*[^aeiouy]*[aeiouy]*[^aeiuoy]*[aeiouy]*
        [^aeiouy]*[aeiouy]*[^aeiouy]*[aeiouy]*[^aeiouy]*[aeiouy]*[^aeiuoy]*[aeiouy]*[^aeiouy]*$'

            (Or explain how it is intended to work)

        Had to wrap that so it did not mess up your screen.

    Regular Expressions are meant for selecting/matching patterns in
    text.  That is one of the powerful features that has kept Unix
    systems under active development for thirty years.

    Some examples can awaken an appreciation for what we have.
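
    As one such example, here is a hedged sketch of a pattern for
    "vowels in ascending order".  It rests on assumptions: the vowels
    are taken to be a, e, i, o, u, y (the same set used in the
    character classes above), repeats are allowed, input is lowercase,
    and the function name 'vowels_ascending' and the sample words are
    invented for illustration.  It is one possible reading of the
    problem, not a definitive answer.

```shell
# Each parenthesised group matches one vowel followed by any run of
# non-vowels; the groups appear in the order a, e, i, o, u, y, so a
# later vowel can never be followed by an earlier one.
vowels_ascending() {
    grep -E -x '[^aeiouy]*(a[^aeiouy]*)*(e[^aeiouy]*)*(i[^aeiouy]*)*(o[^aeiouy]*)*(u[^aeiouy]*)*(y[^aeiouy]*)*'
}

# Hypothetical sample input; prints facetiously and almost,
# and rejects regular (e, u, a) and unix (u, i).
printf '%s\n' facetiously regular almost unix | vowels_ascending
```

    To run it over a real list, feed a one-word-per-line file to the
    function on standard input.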




More information about the Kclug mailing list