Delete all occurrences of file name inside text file. Possible?


  • Delete all occurrences of file name inside text file. Possible?

    Hello,

    I'm new to the world of scripting. About a week. I've searched, and found, and pieced together just about everything I've needed this past week, but I can't find anything that shows me how to do this...

    I have a directory full of text files. Each text file has a file name... obviously. lol

    Inside each text file there are occurrences of the file name. Like this for example...

    file name for a specific file is: dog eats cat.txt

    inside that file there may or may not be instances of the file name. For example there may be a sentence in there that says... 'my dog eats turkey, but mostly my dog eats cat.'

    The file name phrase is 'dog eats cat'... and inside the file there are likely to be multiple instances of that phrase, but not always.

    Could a little batch script go inside each file in a directory and then look inside each file to see if there are instances of the file name phrase in there... and if yes, then delete the phrase?

    Thanks for any help you're willing to offer on this.

    Best,
    ed

  • #2
    Re: Delete all occurrences of file name inside text file. Possible?

    Originally posted by edsmithers
    Could a little batch script go inside each file in a directory and then look inside each file to see if there are instances of the file name phrase in there... and if yes, then delete the phrase?
    Get a list of the files.
    Loop through the list.
    Pick the file name into a variable.
    Open each file into memory or do stream processing.
    Replace each match with an empty string.
    Save the results.
    Back to loop.

    Now, how to implement this depends on the tools you want to use. I think batch can be used, but VBScript and PowerShell are more versatile solutions. A batch solution depends on tokenizing each row and is a lot of work. A regexp would do much better, but regexps aren't available in batch.
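The steps listed above can be sketched as follows. This is a minimal Python sketch rather than one of the tools discussed in the thread; the helper name `strip_own_name` is made up for illustration, and it assumes UTF-8 `.txt` files that sit directly in one folder.

```python
from pathlib import Path


def strip_own_name(folder):
    """Remove each .txt file's own base name from its contents.

    Hypothetical helper: assumes UTF-8 text files directly inside `folder`.
    Returns the names of the files that were changed.
    """
    changed = []
    # Get a list of the files and loop through it.
    for path in sorted(Path(folder).glob("*.txt")):
        # Pick the file name (without extension) into a variable.
        phrase = path.stem
        # Open the file into memory.
        text = path.read_text(encoding="utf-8")
        # Replace each match with an empty string.
        new_text = text.replace(phrase, "")
        # Save the results, but only when something actually changed.
        if new_text != text:
            path.write_text(new_text, encoding="utf-8")
            changed.append(path.name)
    return changed
```

Calling `strip_own_name("C:/temp")` would process every `.txt` file in that folder. The plain `str.replace` matches the phrase anywhere, including inside longer words.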

    -vP



    • #3
      Re: Delete all occurrences of file name inside text file. Possible?

      Try this, found on Experts Exchange. It deletes every line that contains the string from the text file (the whole line, not only the string).

      Code:
      Const ForReading = 1
      Const ForWriting = 2
      Const TriStateUseDefault = -2
       
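      strRoot = "C:\LogFiles"        ' folder to scan (subfolders are included)
      strExt = "txt"                 ' only files with this extension are edited
      strToFind = "dog eats cat"     ' lines containing this text are dropped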
       
      Set objFSO = CreateObject("Scripting.FileSystemObject")
       
      Set objRoot = objFSO.GetFolder(strRoot)
      Set colFiles = objRoot.Files
       
      For Each objFile in colFiles
          If LCase(objFSO.GetExtensionName(objFile.Path)) = LCase(strExt) Then
              EditFile objFile.Path
          End If
      Next
       
      Set colSubFolders = objRoot.SubFolders
       
      For Each objFolder in colSubfolders
          GetSubFolders objFolder.Path
      Next
       
      Sub GetSubFolders(strFolderPath)
          Set objSub = objFSO.GetFolder(strFolderPath)
          
          Set colFiles2 = objSub.Files
       
          For Each objFile2 in colFiles2
              If LCase(objFSO.GetExtensionName(objFile2.Path)) = LCase(strExt) Then
                  EditFile objFile2.Path
              End If
          Next
              
          Set colSubfolders2 = objSub.SubFolders
       
          For Each objFolder2 in colSubfolders2
              GetSubFolders objFolder2.Path
          Next
      End Sub
       
      Sub EditFile(strTextFile)
          blnFound = False
          strNewFile = ""
          
          Set objTextFile = objFSO.OpenTextFile(strTextFile, ForReading, False, TriStateUseDefault)
       
          Do Until objTextFile.AtEndOfStream
              strNextLine = objTextFile.Readline
       
              ' InStr returns 0 when strToFind is not in the line (case-sensitive by default)
              intLineFinder = InStr(strNextLine, strToFind)
              If intLineFinder = 0 Then
                  strNewFile = strNewFile & strNextLine & vbCrLf
              Else
                  blnFound = True
              End If
          Loop
       
          objTextFile.Close
       
          If blnFound Then
              Set objTextFile = objFSO.OpenTextFile(strTextFile, ForWriting)
              objTextFile.Write strNewFile
              objTextFile.Close
          End If
      End Sub
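Note the difference between dropping whole lines, as the script above does, and deleting only the phrase, as the original question asks. A small Python sketch of the two behaviours on in-memory text (Python is used only for illustration, and both function names are made up):

```python
def drop_matching_lines(text, phrase):
    """Keep only the lines that do not contain `phrase` (what the VBScript does)."""
    kept = [line for line in text.splitlines() if phrase not in line]
    return "\n".join(kept)


def delete_phrase(text, phrase):
    """Remove the phrase itself and leave the rest of each line in place."""
    return text.replace(phrase, "")


sample = "my dog eats turkey\nbut mostly my dog eats cat.\nthe end"
print(drop_matching_lines(sample, "dog eats cat"))
print(delete_phrase(sample, "dog eats cat"))
```

The first call loses the entire second line; the second keeps it and removes only the phrase.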



      • #4
        Re: Delete all occurrences of file name inside text file. Possible?

        Originally posted by vonPryz
        The batch solution depends on tokenizing each row and is much work. A regexp would do much better, but those aren't available in batch.
        Actually, there is a 'pattern test' available for batch via FINDSTR /R.
        However, not all regular expression metacharacters are supported, and pattern matching for replacement is not available.

        Here is a simple batch that removes a matching string from the txt-files found in a certain directory. It uses a regular expression to pre-check the file and its lines, but the actual replacement is a plain string substitution, not a regexp.
        Code:
        :: #####################################################################
        :: This batch replaces a certain string from a txt-file
        :: when it matches the name of that file.
        :: #####################################################################
        
        @echo off & color 6A
        SETLOCAL EnableDelayedExpansion
        
        rem # Search the following folder and its subfolders
        rem # for txt-files
        SET "startFolder=C:\temp"
        
        
        rem # ReplaceStr will expand to the name of the file later on
        Set "ReplaceStr=%%~n*"
        rem # ReplaceWith is empty
        Set "ReplaceWith="
        
        rem # Do find each txt-file,
        For /R "%startFolder%" %%* in (*.txt) Do (
          Set "strFile=%%~*"
        
          Call Set ReplaceStr=%ReplaceStr%
          Call Set ReplaceWith=%ReplaceWith%
        
        rem # test each file first if it is needed to be overwritten,
          FINDSTR /PIM /R "\<!ReplaceStr!\>" "!strFile!" &&(
        
        rem # if string was found, loop through the lines (non-empty lines only)
            For /F "usebackq delims=" %%! in ("!strFile!") Do (
              Set strLine=%%!
        
        rem # test each line first if it is needed to be replaced,
              echo/!strLine!|FINDSTR /INA:66 /R "\<!ReplaceStr!\>" &&(
                Call:DoReplace "!ReplaceStr! " "!ReplaceWith!"
                Call:DoReplace " !ReplaceStr!" "!ReplaceWith!"
                Call:DoReplace "!ReplaceStr!" "!ReplaceWith!"
              )
        
        rem # output (note, any existing empty lines are removed!),
              echo/!strLine!>>"%temp%.\$%~n0$.tmp"
            )
            Move "%temp%.\$%~n0$.tmp" "!strFile!"
          )
        )
        
        pause
        
        goto:eof  +--------+
        :DoReplace Subroutine
        Set strLine=!strLine:%~1=%~2!
        goto:eof  +--------+
        NOTE: because FOR /F processing breaks each line's stream of characters into meaningful units (a process called tokenizing), it skips existing empty lines. Furthermore, FOR /F loops also skip lines starting with a semicolon (";"), because the semicolon is the default end-of-line (eol) character, a behavior that is easy to miss.
        As vonPryz already stated, it is therefore better to use a VBScript with the RegExp Replace method, or a PowerShell script with the -match and -replace operators. Both avoid the FOR loop behavior described above and the regexp limitations of batch.
        The benefit of using a regexp for matching and replacing is that a partial match, such as "dog eats cats" or "hotdog eats cat", is not considered a match. The string itself can be at the beginning, at the end, or in the middle of the text.
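The whole-word matching described above can be sketched as follows. This is a Python illustration, not the VBScript or PowerShell route suggested in the post, and `delete_whole_phrase` is a hypothetical name.

```python
import re


def delete_whole_phrase(text, phrase):
    """Delete `phrase` only where it stands as whole words."""
    # re.escape makes the phrase match literally, even if the file name
    # contains characters such as "." or "(" that are special in a regexp.
    # The \b anchors require a word boundary on both sides of the phrase.
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b")
    return pattern.sub("", text)


print(delete_whole_phrase("my dog eats cat.", "dog eats cat"))
print(delete_whole_phrase("a hotdog eats cat", "dog eats cat"))
print(delete_whole_phrase("dog eats cats", "dog eats cat"))
```

Only the first call changes its input. One caveat of this sketch: `\b` only behaves as expected when the phrase starts and ends with a letter, digit or underscore.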


        \Rems
        Last edited by Rems; 8th April 2009, 16:06.



        • #5
          Re: Delete all occurrences of file name inside text file. Possible?

          If you can use Perl:
          Code:
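          use strict;
          use warnings;
          use File::Basename;

          while ( my $textfile = <*.txt> ) {
            # File name without the ".txt" extension, e.g. "dog eats cat"
            my $filename_withoutextension = basename($textfile, ".txt");
            open(my $in,  "<", $textfile) or die "Cannot read $textfile: $!";
            open(my $out, ">", "temp")    or die "Cannot write temp: $!";
            while ( my $line = <$in> ) {
              # \Q...\E makes the file name match literally, even if it contains regexp metacharacters
              $line =~ s/\Q$filename_withoutextension\E//g;
              print $out $line;
            }
            close($in);
            close($out);
            # Replace the original file with the edited copy
            rename("temp", $textfile) or die "Cannot replace $textfile: $!";
          }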
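The write-to-a-temp-file-then-rename pattern used in the script above can also be sketched in Python. This is an illustration under assumptions: the name `rewrite_file` is made up, and the files are assumed to be UTF-8. Unlike the batch FOR /F loop earlier in the thread, reading line by line this way keeps empty lines and lines that start with a semicolon.

```python
import os
import tempfile


def rewrite_file(path, transform):
    """Write transform(line) for every line to a temp file, then swap it in."""
    folder = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so os.replace stays on one volume.
    fd, tmp_path = tempfile.mkstemp(dir=folder, suffix=".tmp")
    try:
        # newline="" keeps the original line endings untouched.
        with os.fdopen(fd, "w", encoding="utf-8", newline="") as out, \
                open(path, "r", encoding="utf-8", newline="") as src:
            for line in src:
                out.write(transform(line))
        # Replace the original only after the copy was written completely.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything went wrong, then re-raise.
        os.remove(tmp_path)
        raise
```

A call such as `rewrite_file("dog eats cat.txt", lambda line: line.replace("dog eats cat", ""))` would then edit one file in place.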
