Globally replace text in several files

From Rosetta Code
Revision as of 12:53, 14 October 2012 by Edwin (talk | contribs) (Added Perl 6 solution.)

The task is to replace every occurrence of a piece of text in a group of text files with another piece of text. For this task we want to replace the text "Goodbye London!" with "Hello New York!" in each file of a given list.

Ada

<lang Ada>with Ada.Strings.Unbounded, Ada.Text_IO, Ada.Command_Line, Ada.Directories;

procedure Global_Replace is

  subtype U_String is Ada.Strings.Unbounded.Unbounded_String;
  function "+"(S: String) return U_String renames
    Ada.Strings.Unbounded.To_Unbounded_String;
  function "-"(U: U_String) return String renames
    Ada.Strings.Unbounded.To_String;
  procedure String_Replace(S: in out U_String; Pattern, Replacement: String) is
     -- example: if S is "Mary had a XX lamb", then String_Replace(S, "X", "little");
     --          will turn S into "Mary had a littlelittle lamb"
     --          and String_Replace(S, "Y", "small"); will not change S
     Index : Natural;
  begin
     loop
        Index := Ada.Strings.Unbounded.Index(Source => S, Pattern => Pattern);
        exit when Index = 0;
        Ada.Strings.Unbounded.Replace_Slice
          (Source => S, Low => Index, High => Index+Pattern'Length-1,
           By => Replacement);
     end loop;
  end String_Replace;
  procedure File_Replace(Filename: String; Pattern, Replacement: String) is
     -- applies String_Replace to each line in the file with the given Filename
     -- propagates any exceptions, when, e.g., the file does not exist 
     I_File, O_File: Ada.Text_IO.File_Type;
     Line: U_String;
     Tmp_Name: String := Filename & ".tmp"; 
        -- name of temporary file; if that file already exists, it will be overwritten
  begin
     Ada.Text_IO.Open(I_File, Ada.Text_IO.In_File, Filename);
     Ada.Text_IO.Create(O_File, Ada.Text_IO.Out_File, Tmp_Name);
     while not Ada.Text_IO.End_Of_File(I_File) loop
        Line := +Ada.Text_IO.Get_Line(I_File);
        String_Replace(Line, Pattern, Replacement);
        Ada.Text_IO.Put_Line(O_File, -Line);
     end loop;
     Ada.Text_IO.Close(I_File);
     Ada.Text_IO.Close(O_File);
     Ada.Directories.Delete_File(Filename);
     Ada.Directories.Rename(Old_Name => Tmp_Name, New_Name => Filename);
  end File_Replace;
  Pattern:     String := Ada.Command_Line.Argument(1);
  Replacement: String :=  Ada.Command_Line.Argument(2);

begin

  Ada.Text_IO.Put_Line("Replacing """ & Pattern
                         & """ by """ & Replacement & """ in"
                         & Integer'Image(Ada.Command_Line.Argument_Count - 2)
                         & " files.");
  for I in 3 .. Ada.Command_Line.Argument_Count loop
     File_Replace(Ada.Command_Line.Argument(I), Pattern, Replacement);
  end loop;

end Global_Replace;</lang>

Output:

> ls ?.txt
1.txt  2.txt  x.txt  y.txt

> more 2.txt
This is a text.
"Goodbye London!" 
"Goodbye London!" 
"Byebye London!" "Byebye London!" "Byebye London!" 

> ./global_replace "Goodbye London" "Hello New York" ?.txt
Replacing "Goodbye London" by "Hello New York" in 4 files.

> more 2.txt
This is a text.
"Hello New York!" 
"Hello New York!" 
"Byebye London!" "Byebye London!" "Byebye London!" 

AutoHotkey

<lang AutoHotkey>SetWorkingDir %A_ScriptDir%  ; Change the working directory to the script's location
listFiles := "a.txt|b.txt|c.txt" ; Define a list of files in the current working directory
loop, Parse, listFiles, |
{ ; The above parses the list based on the | character
	fileread, contents, %A_LoopField% ; Read the file
	fileDelete, %A_LoopField%  ; Delete the file
	stringReplace, contents, contents, Goodbye London!, Hello New York!, All ; replace all occurrences
	fileAppend, %contents%, %A_LoopField% ; Re-create the file with new contents
}</lang>

BASIC

Works with: FreeBASIC

Pass the files on the command line (e.g. global-replace *.txt).

<lang qbasic>CONST matchtext = "Goodbye London!"
CONST repltext = "Hello New York!"
CONST matchlen = LEN(matchtext)

DIM L0 AS INTEGER, x AS INTEGER, filespec AS STRING, linein AS STRING

L0 = 1
WHILE LEN(COMMAND$(L0))

   filespec = DIR$(COMMAND$(L0))
   WHILE LEN(filespec)
       OPEN filespec FOR BINARY AS 1
           linein = SPACE$(LOF(1))
           GET #1, 1, linein
           DO
               x = INSTR(linein, matchtext)
               IF x THEN
                   linein = LEFT$(linein, x - 1) & repltext & MID$(linein, x + matchlen)
                   ' If matchtext and repltext are of equal length (as in this example)
                   ' then you can replace the above line with this:
                   ' MID$(linein, x) = repltext
                   ' This is somewhat more efficient than having to rebuild the string.
               ELSE
                   EXIT DO
               END IF
           LOOP
       ' If matchtext and repltext are of equal length (as in this example), or repltext
       ' is longer than matchtext, you could just write back to the file while it's open
       ' in BINARY mode, like so:
       ' PUT #1, 1, linein
       ' But since there's no way to reduce the file size via BINARY and PUT, we do this:
       CLOSE
       OPEN filespec FOR OUTPUT AS 1
           PRINT #1, linein;
       CLOSE
       filespec = DIR$
   WEND
   L0 += 1

WEND</lang>

C

<lang C>#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <err.h>

char * find_match(char *buf, char * buf_end, char *pat, size_t len)
{
	ptrdiff_t i;
	char *start = buf;
	/* "<=" so that a match ending exactly at the end of the buffer is found */
	while (start + len <= buf_end) {
		for (i = 0; i < len; i++)
			if (start[i] != pat[i]) break;

		if (i == len) return start;
		start++;
	}
	return 0;
}

int replace(char *from, char *to, char *fname)
{
#define bail(msg) { warn(msg" '%s'", fname); goto done; }
	struct stat st;
	int ret = 0;
	char *buf = 0, *start, *end;
	size_t len = strlen(from), nlen = strlen(to);
	int fd = open(fname, O_RDWR);

	if (fd == -1) bail("Can't open");
	if (fstat(fd, &st) == -1) bail("Can't stat");
	if (!(buf = malloc(st.st_size))) bail("Can't alloc");
	if (read(fd, buf, st.st_size) != st.st_size) bail("Bad read");

	start = buf;
	end = find_match(start, buf + st.st_size, from, len);
	if (!end) goto done; /* no match found, don't change file */

	ftruncate(fd, 0);
	lseek(fd, 0, 0);
	do {
		write(fd, start, end - start);	/* write content before match */
		write(fd, to, nlen);		/* write replacement of match */
		start = end + len;		/* skip to end of match */
		/* find match again */
		end = find_match(start, buf + st.st_size, from, len);
	} while (end);

	/* write leftover after last match */
	if (start < buf + st.st_size)
		write(fd, start, buf + st.st_size - start);

done:
	if (fd != -1) close(fd);
	if (buf) free(buf);
	return ret;
}

int main()
{
	char *from = "Goodbye, London!";
	char *to   = "Hello, New York!";
	char * files[] = { "test1.txt", "test2.txt", "test3.txt" };
	int i;

	for (i = 0; i < sizeof(files)/sizeof(char*); i++)
		replace(from, to, files[i]);

	return 0;
}</lang>

C++

<lang cpp>#include <fstream>
#include <iterator>
#include <boost/regex.hpp>
#include <string>
#include <iostream>

int main( int argc , char *argv[ ] ) {
   boost::regex to_be_replaced( "Goodbye London\\s*!" ) ;
   std::string replacement( "Hello New York!" ) ;
   for ( int i = 1 ; i < argc ; i++ ) {
      std::ifstream infile ( argv[ i ] ) ;
      if ( infile ) {
         std::string filetext( (std::istreambuf_iterator<char>( infile )) ,
               std::istreambuf_iterator<char>( ) ) ;
         std::string changed ( boost::regex_replace( filetext , to_be_replaced , replacement )) ;
         infile.close( ) ;
         std::ofstream outfile( argv[ i ] , std::ios_base::out | std::ios_base::trunc ) ;
         if ( outfile.is_open( ) ) {
            outfile << changed ;
            outfile.close( ) ;
         }
      }
      else
         std::cout << "Can't find file " << argv[ i ] << " !\n" ;
   }
   return 0 ;
}</lang>

D

Works with: D version 2

<lang d>import std.file, std.array;

void main() {

   auto from = "Goodbye London!", to = "Hello New York!";
   foreach (fn; "a.txt b.txt c.txt".split()) {
       write(fn, replace(cast(string)read(fn), from, to));
   }

}</lang>

Go

<lang go>package main

import (

   "bytes"
   "io/ioutil"
   "log"
   "os"

)

func main() {

   gRepNFiles("Goodbye London!", "Hello New York!", []string{
       "a.txt",
       "b.txt",
       "c.txt",
   })

}

func gRepNFiles(olds, news string, files []string) {

   oldb := []byte(olds)
   newb := []byte(news)
   for _, fn := range files {
       if err := gRepFile(oldb, newb, fn); err != nil {
           log.Println(err)
       }
   }

}

func gRepFile(oldb, newb []byte, fn string) (err error) {

   var f *os.File
   if f, err = os.OpenFile(fn, os.O_RDWR, 0); err != nil {
       return
   }
   defer func() {
       if cErr := f.Close(); err == nil {
           err = cErr
       }
   }()
   var b []byte
   if b, err = ioutil.ReadAll(f); err != nil {
       return
   }
   if bytes.Index(b, oldb) < 0 {
       return
   }
   r := bytes.Replace(b, oldb, newb, -1)
   if err = f.Truncate(0); err != nil {
       return
   }
   _, err = f.WriteAt(r, 0)
   return

}</lang>

Icon and Unicon

This example uses the Unicon stat function. It can be rewritten for Icon to aggregate the file in a reads loop.

<lang Icon>procedure main()
globalrepl("Goodbye London","Hello New York","a.txt","b.txt") # variable args for files
end

procedure globalrepl(old,new,files[])

every fn := !files do

  if s := reads(f := open(fn,"bu"),stat(f).size) then {
     writes(seek(f,1),replace(s,old,new))
     close(f)
     }
  else write(&errout,"Unable to open ",fn)

end

link strings # for replace</lang>

strings.icn provides replace.

J

If files is a variable with the desired list of file names:

<lang j>require'strings'
(1!:2~rplc&('Goodbye London!';'Hello New York!')@(1!:1))"0 files</lang>


Liberty BASIC

<lang lb> nomainwin

file$( 1) ="data1.txt"
file$( 2) ="data2.txt"
file$( 3) ="data3.txt"


for i =1 to 3

   open file$( i) for input as #i
       orig$ =input$( #i, lof( #i))
   close #i
   dummy$ =FindReplace$( orig$, "Goodbye London!", "Hello New York!", 1)
   open "RC" +file$( i) for output as #o
       #o dummy$;
   close #o

next i

end

function FindReplace$( FindReplace$, find$, replace$, replaceAll) ' Target string, string to find, string to replace it with, flag 0/1 for 'replace all occurrences'.

   if ( ( FindReplace$ <>"") and ( find$ <>"") ) then
       fLen =len( find$)
       rLen =len( replace$)
       do
           fPos =instr( FindReplace$, find$, fPos)
           if not( fPos) then exit function
           pre$            =left$( FindReplace$, fPos -1)
           post$           =mid$( FindReplace$, fPos +fLen)
           FindReplace$    =pre$ +replace$ +post$
           fPos            =fPos +( rLen -fLen) +1
       loop while (replaceAll)
   end if

end function </lang>

Lua

<lang lua>filenames = { "f1.txt", "f2.txt" }

for _, fn in pairs( filenames ) do

   fp = io.open( fn, "r" )
   str = fp:read( "*all" )
   str = string.gsub( str, "Goodbye London!", "Hello New York!" )
   fp:close()
   fp = io.open( fn, "w+" )
   fp:write( str )
   fp:close()

end</lang>


OpenEdge/Progress

<lang progress>FUNCTION replaceText RETURNS LOGICAL (

  i_cfile_list   AS CHAR,
  i_cfrom        AS CHAR,
  i_cto          AS CHAR

):

  DEF VAR ii     AS INT.
  DEF VAR lcfile AS LONGCHAR.
  DO ii = 1 TO NUM-ENTRIES( i_cfile_list ):
     COPY-LOB FROM FILE ENTRY( ii, i_cfile_list ) TO lcfile.
     lcfile = REPLACE( lcfile, i_cfrom, i_cto ).
     COPY-LOB FROM lcfile TO FILE ENTRY( ii, i_cfile_list ).
  END.
  

END FUNCTION. /* replaceText */

replaceText(

  "a.txt,b.txt,c.txt",
  "Goodbye London!",
  "Hello New York!"

).</lang>

Pascal

Works with: Free_Pascal

<lang pascal>Program StringReplace;

uses

 Classes, StrUtils;

const

 fileName: array[1..3] of string = ('a.txt', 'b.txt', 'c.txt');
 matchText = 'Goodbye London!';
 replaceText = 'Hello New York!';

var

 AllText: TStringlist;
 i, j: integer;

begin

 for j := low(fileName) to high(fileName) do
 begin
  AllText := TStringlist.Create;
  AllText.LoadFromFile(fileName[j]);
  for i := 0 to AllText.Count-1 do
    AllText.Strings[i] := AnsiReplaceStr(AllText.Strings[i], matchText, replaceText);
  AllText.SaveToFile(fileName[j]);
  AllText.Destroy;
 end;

end.</lang>

Perl

<lang bash>perl -pi -e "s/Goodbye London\!/Hello New York\!/g;" a.txt b.txt c.txt</lang>

Perl 6

Current Perl 6 implementations do not yet support the -i flag for editing files in place, so we roll our own (rather unsafe) version:

<lang perl6>spurt $_, slurp($_).subst('Goodbye London!', 'Hello New York!', :g)

   for <a.txt b.txt c.txt>;</lang>
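
One common way to make a whole-file rewrite like this less fragile is to write the new contents to a temporary file in the same directory and rename it over the original only after the write has succeeded. The following is a minimal sketch of that idea in Python, not a translation of the solution above; the function name `replace_in_file` is hypothetical, and the sketch assumes UTF-8 text files that fit in memory.

```python
import os
import tempfile


def replace_in_file(path, old, new):
    """Replace every occurrence of old with new in the file at path.

    Returns the number of replacements made (hypothetical helper).
    """
    # newline="" keeps the file's original line endings untouched.
    with open(path, encoding="utf-8", newline="") as f:
        text = f.read()
    count = text.count(old)
    if count == 0:
        return 0  # nothing to replace, so leave the file alone
    # Create the temporary file next to the original so the final
    # rename stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8", newline="") as tmp:
            tmp.write(text.replace(old, new))
        os.replace(tmp_path, path)  # swap the new file into place
    except BaseException:
        os.unlink(tmp_path)  # do not leave a stray temporary file behind
        raise
    return count


if __name__ == "__main__":
    for name in ("a.txt", "b.txt", "c.txt"):
        if os.path.exists(name):
            replace_in_file(name, "Goodbye London!", "Hello New York!")
```

If anything fails before the rename, the original file is left as it was. One trade-off of this sketch is that the rewritten file gets the permissions of the temporary file rather than those of the original.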

PicoLisp

<lang PicoLisp>(for File '(a.txt b.txt c.txt)

  (call 'mv File (tmp File))
  (out File
     (in (tmp File)
        (while (echo "Goodbye London!")
           (prin "Hello New York!") ) ) ) )</lang>

PowerBASIC

Translation of: BASIC

<lang powerbasic>$matchtext = "Goodbye London!"
$repltext = "Hello New York!"

FUNCTION PBMAIN () AS LONG

   DIM L0 AS INTEGER, filespec AS STRING, linein AS STRING
   L0 = 1
   WHILE LEN(COMMAND$(L0))
       filespec = DIR$(COMMAND$(L0))
       WHILE LEN(filespec)
           OPEN filespec FOR BINARY AS 1
               linein = SPACE$(LOF(1))
               GET #1, 1, linein
               ' No need to jump through FB's hoops here...
               REPLACE $matchtext WITH $repltext IN linein
               PUT #1, 1, linein
               SETEOF #1
           CLOSE
           filespec = DIR$
       WEND
       INCR L0
   WEND

END FUNCTION</lang>

PureBasic

<lang PureBasic>Procedure GRTISF(List File$(), Find$, Replace$)

 Protected Line$, Out$, OutFile$, i
 ForEach File$()
   fsize=FileSize(File$())
   If fsize<=0: Continue: EndIf
   If ReadFile(0, File$())
     i=0
     ;
     ; generate a temporary file in a safe way
     Repeat
       file$=GetTemporaryDirectory()+base$+"_"+Str(i)+".tmp"
       i+1
     Until FileSize(file$)=-1
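     i=CreateFile(1, file$) ; open the temporary file as file #1, which is written below
     If i
       ; Copy the infile to the outfile while replacing any needed text
       While Not Eof(0)
         Line$=ReadString(0)
         Out$=ReplaceString(Line$,Find$,Replace$)
         WriteStringN(1,Out$) ; ReadString drops the line break, so write it back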
       Wend
       CloseFile(1)
     EndIf
     CloseFile(0)
     If i
       ; If we made a new file, copy it back.
       CopyFile(file$, File$())
       DeleteFile(file$)
     EndIf
   EndIf
 Next

EndProcedure</lang>

Implementation

NewList Xyz$()
AddElement(Xyz$()): Xyz$()="C:\\a.txt"
AddElement(Xyz$()): Xyz$()="C:\\b.txt"
AddElement(Xyz$()): Xyz$()="D:\\c.txt"

GRTISF(Xyz$(), "Goodbye London", "Hello New York")

Python

Adapted from the Python documentation for the fileinput module. (Note: in-place editing does not work on MS-DOS 8+3 filesystems.)

<lang python>import fileinput

for line in fileinput.input(inplace=True):
    print(line.replace('Goodbye London!', 'Hello New York!'), end='')
</lang>
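
The snippet above takes its file names from the command line. As a hedged variation, the sketch below passes an explicit list of files and asks `fileinput` to keep a backup copy of each original. The function name `replace_in_files` and the `.bak` suffix are illustrative choices, not part of the original solution.

```python
import fileinput


def replace_in_files(paths, old, new, backup=".bak"):
    """Rewrite each file in paths in place, keeping a backup of each original."""
    paths = list(paths)
    if not paths:
        # An empty list would make fileinput fall back to sys.argv or stdin.
        return
    with fileinput.input(files=paths, inplace=True, backup=backup) as lines:
        for line in lines:
            # With inplace=True, standard output is redirected into the
            # file currently being read, so print() rewrites that file.
            print(line.replace(old, new), end="")


if __name__ == "__main__":
    import os

    names = [n for n in ("a.txt", "b.txt", "c.txt") if os.path.exists(n)]
    replace_in_files(names, "Goodbye London!", "Hello New York!")
```

Because the replacement is applied line by line, text that spans a line break is not matched; that is fine for this task, where the search text sits on a single line.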

Ruby

Like Perl:

ruby -pi -e "gsub('Goodbye London!', 'Hello New York!')" a.txt b.txt c.txt

Run BASIC

<lang runbasic>file$(1) ="data1.txt"
file$(2) ="data2.txt"
file$(3) ="data3.txt"

for i = 1 to 3

   open file$(i) for input as #in
       fileBefore$ = input$( #in, lof( #in))
   close #in

   fileAfter$ = strRep$(fileBefore$, "Goodbye London!", "Hello New York!")
   open "new_" +  file$(i) for output as #out
       print #out,fileAfter$;
   close #out

next i
end

' --------------------------------
' string replace - rep str with
' --------------------------------
FUNCTION strRep$(str$,rep$,with$)
ln  = len(rep$)
ln1 = ln - 1
i   = 1
while i <= len(str$)
    if mid$(str$,i,ln) = rep$ then
        strRep$ = strRep$ + with$
        i = i + ln1
    else
        strRep$ = strRep$ + mid$(str$,i,1)
    end if
i = i + 1
WEND
END FUNCTION</lang>

Seed7

<lang seed7>$ include "seed7_05.s7i";

 include "getf.s7i";

const proc: main is func

 local
   var string: fileName is "";
   var string: content is "";
 begin
   for fileName range [] ("a.txt", "b.txt", "c.txt") do
     content := getf(fileName);
     content := replace(content, "Goodbye London!", "Hello New York!");
     putf(fileName, content);
   end for;
 end func;</lang>

Tcl

Library: Tcllib (Package: fileutil)

<lang tcl>package require Tcl 8.5
package require fileutil

# Parameters to the replacement
set from "Goodbye London!"
set to "Hello New York!"
# Which files to replace
set fileList [list a.txt b.txt c.txt]

# Make a command fragment that performs the replacement on a supplied string
set replacementCmd [list string map [list $from $to]]
# Apply the replacement to the contents of each file
foreach filename $fileList {
    fileutil::updateInPlace $filename $replacementCmd
}</lang>

TUSCRIPT

<lang tuscript>
$$ MODE TUSCRIPT
files="a.txt'b.txt'c.txt"

BUILD S_TABLE search = ":Goodbye London!:"

LOOP file=files

ERROR/STOP OPEN (file,WRITE,-std-)
ERROR/STOP CREATE ("scratch",FDF-o,-std-)
 ACCESS q: READ/STREAM/RECORDS/UTF8 $file s,aken+text/search+eken
 ACCESS s: WRITE/ERASE/STREAM/UTF8 "scratch" s,aken+text+eken
  LOOP
   READ/EXIT q
   IF (text.ct.search) SET text="Hello New York!"
   WRITE/ADJUST s
  ENDLOOP
 ENDACCESS/PRINT q
 ENDACCESS/PRINT s
ERROR/STOP COPY ("scratch",file)
ERROR/STOP CLOSE (file)

ENDLOOP
ERROR/STOP DELETE ("scratch")
</lang>

TXR

Another use of a screwdriver as a hammer.

The dummy empty output at the end serves a dual purpose. Firstly, without argument clauses following it, the @(next `!mv ...`) will not actually happen (lazy evaluation!). Secondly, if a txr script performs no output on standard output, the default action of dumping variable bindings kicks in.

<lang txr>@(next :args)
@(collect)
@file
@(next `@file`)
@(freeform)
@(coll :gap 0)@notmatch@{match /Goodbye, London!/}@(end)@*tail@/\n/
@(output `@file.tmp`)
@(rep)@{notmatch}Hello, New York!@(end)@tail
@(end)
@(next `!mv @file.tmp @file`)
@(output)
@(end)
@(end)</lang>

Run:

$ cat foo.txt
aaaGoodbye, London!aaa
Goodbye, London!
$ cat bar.txt
aaaGoodbye, London!aaa
Goodbye, London!
$ txr replace-files.txr foo.txt bar.txt
$ cat foo.txt
aaaHello, New York!aaa
Hello, New York!
$ cat bar.txt
aaaHello, New York!aaa
Hello, New York!

Run, with no directory permissions:

$ chmod a-w .
$ txr replace-files.txr foo.txt bar.txt
txr: unhandled exception of type file_error:
txr: could not open foo.txt.tmp (error 13/Permission denied)
false