
Extracting data from text


  • Extracting data from text

    Hi all, I'm new to batch scripting and need some help. I'm trying to extract data from a text file that contains recurring blocks. Below is a sample of the input data. I'm assuming some element of iteration will be involved, as there are about 50 blocks like the one below in every txt file.






    NAME __________________________________________________ DATE _____________________
    YC0402 R002 CYCOAA05 CONSIGNMENT HANDOVER REPORT - BATCH DETAIL 28 FEB 12 07:26 PAGE 2
    Requested By W179VDL
    COMPANY WW EF2
    From Date 27 FEB 12 To Date 27 FEB 12 Requestor CHERILY Year/Week(s) 1235
    Depot EF2 ELANDSFONTEIN Billing Office JNB OLIVER THAMBO INTERNATIONAL
    Batched Date 27 FEB 12 Batch Number EF2 2829
    Seq Connote Nr TOP Remarks Seq Connote Nr TOP Remarks
    -----------------------------------------------------------------------------------------------------------------------

    01 261984326 S ______________________________ 21 565814056 S ______________________________
    02 261984330 S ______________________________ 22 565814158 S ______________________________
    03 261984365 S ______________________________ 23 565814175 S ______________________________
    04 261984374 S ______________________________ 24 565814325 S ______________________________
    05 261984391 S ______________________________ 25 565814348 S ______________________________
    06 261984405 S ______________________________ 26 565814351 S ______________________________
    07 261984414 S ______________________________ 27 565814365 S ______________________________
    08 261984428 S ______________________________ 28 565814379 S ______________________________
    09 261984431 S ______________________________ 29 565814382 S ______________________________
    10 261984445 S ______________________________ 30 565814498 S ______________________________
    11 261984459 S ______________________________ 31 565814515 S ______________________________
    12 261984462 S ______________________________ 32 659920870 S ______________________________
    13 291961187 S ______________________________ 33 937745280 S ______________________________
    14 291961258 R ______________________________ 34 937745364 S ______________________________
    15 502345561 S ______________________________ 35 937745435 S ______________________________
    16 565813705 S ______________________________ 36 937745483 S ______________________________
    17 565813736 S ______________________________ 37 937745497 S ______________________________
    18 565813740 S ______________________________ 38 972348846 S ______________________________
    19 565813991 S ______________________________ 39 972348850 S ______________________________
    20 565814042 S ______________________________ 40 972348863 S ______________________________






    The output needs to be in the format below:

    28 FEB 12 07:26
    Requested By W179VDL


    Batch Number EF2 2829

    01 261984326 21 565814056
    02 261984330 22 565814158
    03 261984365 23 565814175
    04 261984374 24 565814325
    05 261984391 25 565814348
    06 261984405 26 565814351
    07 261984414 27 565814365
    08 261984428 28 565814379
    09 261984431 29 565814382
    10 261984445 30 565814498
    11 261984459 31 565814515
    12 261984462 32 659920870
    13 291961187 33 937745280
    14 291961258 34 937745364
    15 502345561 35 937745435
    16 565813705 36 937745483
    17 565813736 37 937745497
    18 565813740 38 972348846
    19 565813991 39 972348850
    20 565814042 40 972348863

  • #2
    Re: Extracting data from text

    Ultimately, it sounds like you're doing a lot of word processing. You might want to try Visual Basic for Applications (VBA) code in MS Word. When you record a macro in Word, the recorder writes lines of VBA code in the background.

    Record the mechanical steps you want repeated, decide on the logic you want to use to find and then change the text you're after, and wrap that logic around the steps you recorded.

    Tip: once you get it working how you want, add a line near the beginning to turn off screen refresh, and then turn refresh back on as one of the final lines when you're done--this speeds up the code execution immensely, since Word won't have to wait for a screen re-draw every time it executes a line of code.
    *RicklesP*
    MSCA (2003/XP), Security+, CCNA

    ** Remember: credit where credit is due, and reputation points as appropriate **



    • #3
      Re: Extracting data from text

      Have you tried importing this into Excel?

      You have a set of rows that obviously aren't required, so you would start off by opening the file in Excel as a text file, then record a new macro that deletes the rows you don't need.

      Next you would run the [Text to columns] option under the data tab.

      Set up the import as fixed width and move the data to a new tab, excluding the columns you don't need.

      Save the new tab as a text file that is fixed length as well and you will have the results that you're looking for.

      Save the macro into a macro-enabled workbook that you can refine into a process that will open the first text file, process it and spit out the final text file you're looking for.
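For anyone who would rather script these steps than record a macro, here is a rough Python sketch of the same idea: drop the rows that aren't detail rows, split the rest into columns, and keep only the Seq and Connote Nr columns. It is only a sketch based on the single sample page posted above. The function name `keep_wanted_columns` is made up for illustration, and it assumes the Remarks column is always one unbroken run of underscores (no spaces), so each half of a row splits into exactly four fields.

```python
import re

# Assumption: a detail row starts with a 2-digit sequence number followed by
# a 9-digit connote number, as in the sample posted in this thread.
DETAIL_ROW = re.compile(r"^\s*\d{2}\s+\d{9}\s")

# Each half of a detail row has four columns: Seq, Connote Nr, TOP, Remarks.
COLUMNS_PER_HALF = 4


def keep_wanted_columns(line):
    """Return 'seq connote [seq connote]' for a detail row, or None otherwise."""
    if not DETAIL_ROW.match(line):
        return None  # header, separator or blank row: treat as not required
    fields = line.split()
    kept = []
    # Walk the row one half at a time and keep the first two columns of each.
    for start in range(0, len(fields), COLUMNS_PER_HALF):
        kept.extend(fields[start:start + 2])
    return " ".join(kept)
```

Calling it on every line of the file and skipping the `None` results gives the numeric part of the wanted output. A page whose right-hand half is empty simply yields two fields instead of four.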



      • #4
        Re: Extracting data from text

        Changing the part below the "------------" line is easy: just specify the range where the changes should happen and let the tool apply them. The changes are: remove all alphabetic characters and punctuation marks from the block, then collapse any repeated spaces into one.
        This is easily done with SED (GNU sed) commands.
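To make those two changes concrete outside of sed, here is a small Python sketch of them. It is not the sed solution itself, just the same transformation under the assumption that the data is plain ASCII. The function name `digits_only` is made up for illustration, and the final `strip()` is an extra that removes the leading and trailing space.

```python
import re
import string

# Translation table that deletes every ASCII letter and punctuation character.
# string.punctuation includes the underscore used as filler in the Remarks column.
_STRIP_TABLE = str.maketrans("", "", string.ascii_letters + string.punctuation)


def digits_only(line):
    """Remove letters and punctuation, then collapse runs of whitespace."""
    cleaned = line.translate(_STRIP_TABLE)
    return re.sub(r"\s+", " ", cleaned).strip()
```

This should only be applied to the lines below the dashed separator. Applied to the header lines, it would also strip the text that has to be kept there.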

        To convert the text part above the "------------" line, you need to be able to specify and recognize strings of text (patterns) on each line, so that you can remove what you don't need. If that is possible, you can use SED commands for editing that part of the text file as well.
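As an illustration of what "recognizing patterns on each line" could look like, here is a Python sketch that picks the three wanted header items out of the sample page. The patterns are assumptions drawn only from the one sample in this thread (the timestamp sits between "BATCH DETAIL" and "PAGE", and the other two items start with "Requested By" and "Batch Number"). The name `header_fields` is hypothetical.

```python
import re

# Assumed patterns, based only on the sample page posted in this thread.
STAMP = re.compile(r"BATCH DETAIL\s+(.*?)\s+PAGE\b")
REQUESTED = re.compile(r"Requested By\s+\S+")
BATCH = re.compile(r"Batch Number\s+.*\S")


def header_fields(lines):
    """Return the wanted header items found in the lines, in order of appearance."""
    found = []
    for line in lines:
        stamp = STAMP.search(line)
        if stamp:
            found.append(stamp.group(1))  # only the text between the two markers
            continue
        for pattern in (REQUESTED, BATCH):
            match = pattern.search(line)
            if match:
                found.append(match.group(0))  # marker text plus its value
                break
    return found
```

Lines that match none of the patterns are simply ignored, which mirrors the "delete everything that doesn't match" approach.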

        You will have to play around with SED yourself to find the best way of converting your files. Here is an example to get you started. (If you have a large number of sed commands, it is better to put them into a file and use the -f switch. The batch below defines several variables instead, to keep the command lines short.)

        Code:
        :: http://forums.petri.com/showthread.php?t=59247
        
        @echo off
        :: For this sample, SED.exe and its 3 dll files are assumed to be in the batch's folder.
        Set "SED=%~dp0SED.exe"
        
        
        Set  "InputFile=in.txt"
        Set "OutputFile=out.txt"
        
        
        :: Regular expressions
        
           :: p2 = the dashed separator line, p1 = the "NAME ___" line that starts a page.
           :: They delimit the ranges 'from - to' (and vice versa).
           :: p2 requires at least one dash ("--*") so that empty lines do not match it.
        set "p2=^[ \t]*--*[ \t]*$"
        set "p1=^[ \t]*NAME _.*_*[ \t]*$"
        
        :: SED expressions on the 2nd block (the part that should end up numeric-only):
        :: skip the two delimiter lines, strip letters and punctuation, collapse whitespace.
        Set "expr1=/%p2%/,/%p1%/ {"
        Set "expr1=%expr1% /%p2%/b; /%p1%/b;"
        Set "expr1=%expr1% s/[[:alpha:]]\|[[:punct:]]//g;"
        Set "expr1=%expr1% s/\s\s*/ /g }"
        
        
        :: SED expressions on the 1st block (the change-text part)
        
           :: Keep only the lines that match one of these strings.
        set "p3=BATCH DETAIL"
        set "p4=PAGE"
        set "p5=Requested By"
        set "p6=Batch Number"
        
        :: Delete non-matching lines, then trim the text around the wanted items.
        Set "expr2=/%p1%/,/%p2%/ {"
        Set "expr2=%expr2% /%p3%\|%p5%\|%p6%/!d;"
        Set "expr2=%expr2% s/^.*%p3%[ \t]*//gI;"
        Set "expr2=%expr2% s/[ \t]*%p4%[ \t].*$//gI;"
        Set "expr2=%expr2% s/^.*%p6%/\n\n%p6%/gI }"
        
        
        :: Run SED
           :: A 2-step process: step 1 writes a work file, step 2 reads it.
        Set "wrkfile=%temp%\$.sed"
        
        "%SED%" -e "%expr1%" "%InputFile%" > "%wrkfile%"
        "%SED%" -e "%expr2%" "%wrkfile%"   >"%OutputFile%"

        /Rems

        (Similar posts
        http://forums.petri.com/showthread.p...211#post212211
        http://forums.petri.com/showthread.p...225#post253225
        http://forums.petri.com/showthread.p...230#post253230
        http://forums.petri.com/showthread.p...802#post253802

        http://forums.petri.com/showthread.p...052#post255052
        )
        Last edited by Rems; 9th March 2012, 20:04.

        This posting is provided "AS IS" with no warranties, and confers no rights.

        __________________

        ** Remember to give credit where credit's due **
        and leave Reputation Points for meaningful posts



        • #5
          Re: Extracting data from text

          Take a look at this Excel spreadsheet.

          Copy your data into the first sheet and run the only macro.

          You will then have the data you're looking for.

          Press Alt+F11 and look at the code generated by the Record Macro option, which I edited slightly.

          It should give you some ideas.
          Attached Files
