File extension is in extensions list: Difference between revisions

→‎{{header|Fortran}}: Follow the new requirements.
Line 232:
 
=={{header|Fortran}}==
The plan is to use the extractor function for file name extensions that was defined in another problem to obtain the extension's text, then use that text to index a character string of approved extensions. These are defined with the period included, but no extension's text can include a period - all start with a period and continue with letters or digits only - so the specification of a list of such items need not mess about with quotes and commas and so forth. The deed is done via function EXTIN(FNAME,LIST), but there is a lot of support stuff in the absence of an existing library to call upon.
 
When the task was revised to include a # as a possible character in the file name extension part, the scan in FEXT that allowed only certain characters via GOODEXT required changing. One possibility was to increase the size of GOODEXT so as to encompass the three characters in ODDITIES, which include a #. But they also include a : (all this is from a different context) and in file names, under DOS, this character is special. So, all in all, it seemed better to retreat from exclusions and simply allow all characters, as per the original specification. This done, the function EXTIN that checked that the extension was one of a specified set seemed no longer restricted to file name extensions, so instead it became FOUND(TEXT,LIST) and the period, the only special character now, is used as the delimiter in LIST.
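
As a minimal sketch of such an unrestricted scan (this is not the FEXT of this entry, and the name EXT_OF is hypothetical), the rightmost period can be located with the INDEX intrinsic and its BACK argument, whereupon everything from that period onwards is taken as the extension. Free-form source is assumed.
<lang Fortran>program ext_demo
  implicit none
  print '(3a)', '[', trim(ext_of('Mydata.a##')), ']'        ! [.a##]
  print '(3a)', '[', trim(ext_of('/path/to.my/file')), ']'  ! [.my/file]
  print '(3a)', '[', trim(ext_of('Mydata...')), ']'         ! [.]
  print '(3a)', '[', trim(ext_of('Mydata')), ']'            ! []
contains
  ! Hypothetical helper: the text from the rightmost period onwards,
  ! or blank if the name contains no period at all.
  function ext_of(fname) result(ext)
    character(len=*), intent(in) :: fname
    character(len=len(fname)) :: ext
    integer :: p
    ext = ''
    p = index(fname, '.', back=.true.)   ! position of the rightmost period, 0 if absent
    if (p > 0) ext = fname(p:)           ! any character at all may follow the period
  end function ext_of
end program ext_demo</lang>
No character is excluded, so a # or a / in the tail is accepted just as in the results shown further on.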
 
 
Petty details include the list employing capitals only, as the match is to not distinguish capital from lower case letters, so in the usual way the candidate extension's text is converted to capitals. The list could be subjected to UPCASE, but it is bad form to damage what might be a constant, and one possibly in read-only storage at that. An internal working copy could be made which would then be fed to UPCASE, except that this would be a waste on every invocation. A further trick involves appending a period to the candidate text so that for example ".JP" becomes ".JP." - otherwise a ".JP" would be found in the sequence ".JPG" which would be wrong, so as a result, the list of texts must have a period appended to its last entry, otherwise it would not be findable. Again, this could be done internally, via <code>INDEX(LIST//".",EXT(1:L)//".")</code> at a run-time cost.
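
The effect of the appended period can be seen in a small free-form sketch that uses only the INDEX intrinsic on literal texts; the list value is the ".PNG.JPG." of the earlier version, chosen here purely for illustration.
<lang Fortran>program delim_demo
  implicit none
  character(len=*), parameter :: list = '.PNG.JPG.'
  print *, index(list, '.JP') > 0            ! T: a false match inside ".JPG"
  print *, index(list, '.JP' // '.') > 0     ! F: the appended period blocks the partial match
  print *, index(list, '.JPG' // '.') > 0    ! T: a whole entry is matched
  ! The list's final period could instead be supplied on every call, at a run-time cost.
  print *, index('.PNG.JPG' // '.', '.JPG' // '.') > 0   ! T
end program delim_demo</lang>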
Line 244 ⟶ 243:
A final annoyance is the presence of trailing spaces because character variables are of fixed size and so must be made "surely long enough" for the longest expectation. This may not cause trouble on output as spaces look just as blank as blank space, but they may well cause trouble in the internal tests. Thus integer function LSTNB reports the last non-blank, and so a test text can be presented with no trailing spaces, as in TEST(I)(1:LSTNB(TEST(I))) or similar. With Fortran 2003, there is a facility for redefining the sizes of character variables on-the-fly so that this problem can be evaded.
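
A hedged sketch of the trailing-space problem follows, in free-form source. The intrinsic LEN_TRIM (available since Fortran 90) is assumed to stand in for this entry's own LSTNB, and the deferred-length variable shows the Fortran 2003 facility; the variable names are illustrative only.
<lang Fortran>program blanks_demo
  implicit none
  character(len=20) :: fixed               ! fixed size, padded with spaces
  character(len=:), allocatable :: flex    ! Fortran 2003: the length is set on assignment
  fixed = '.JPG'
  print *, len(fixed), len_trim(fixed)     ! 20 and 4
  print *, index('.PNG.JPG.', fixed // '.') > 0                      ! F: the padding is part of the text
  print *, index('.PNG.JPG.', fixed(1:len_trim(fixed)) // '.') > 0   ! T: trailing spaces removed
  flex = trim(fixed)                       ! allocated to exactly four characters
  print *, len(flex)                       ! 4
  print *, index('.PNG.JPG.', flex // '.') > 0                       ! T: no trimming needed
end program blanks_demo</lang>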
 
The MODULE protocol is employed for the convenience of not having to respecify the type of the functions in every calling routine, and also to facilitate the collection of types of characters. Otherwise, prior to F90 there would have to be various COMMON statements, or routines would each simply respecify whatever character sets they needed.<lang Fortran> MODULE TEXTGNASH !Some text inspection.
CHARACTER*10 DIGITS !Integer only.
CHARACTER*11 DDIGITS !With a full stop masquerading as a decimal point.
Line 272 ⟶ 271:
 
CHARACTER*62 GOODEXT !These are all the characters allowed
EQUIVALENCE (CHARACTER(8),GOODEXT) !Starts with the first digit.
INTEGER MEXT !A fixed bound.
PARAMETER (MEXT = 28) !This is perfect.
CONTAINS
INTEGER FUNCTION LSTNB(TEXT) !Sigh. Last Not Blank.
Line 319 ⟶ 318:
L1 = L2 !Starting at the end...
10 IF (L1.GT.0) THEN !Damnit, can't rely on DO WHILE(safe & test)
IF (FNAME(L1:L1).NE.".") THEN !So do the two parts explicitly.
L1 = L1 - 1 !Well, that was a valid character for an extension.
GO TO 10 !So, move back one and try again.
END IF !Until a period is found.
L1 = L1 - 1 !Here. Thus include the period.
GO TO 20 !And escape.
END IF !Keep on moving back.
L1 = L2 !If we're here, no period was found.
Line 332 ⟶ 329:
END FUNCTION FEXT !Possibly, blank.
 
LOGICAL FUNCTION FOUND(TEXT,LIST) !Is TEXT found in the LIST?
CHARACTER*(*) TEXT !The text.
CHARACTER*(*) LIST !A sequence, separated by the periods. Like ".EXE.TXT.etc."
CHARACTER*(LEN(TEXT)) EXT !A scratchpad of sufficient size.
L = LSTNB(TEXT) !Find its last non-blank.
IF (L.LE.0) THEN !A null text?
FOUND = .FALSE. !Yep. Can't be in the list.
ELSE !Otherwise,
EXT(1:L) = TEXT(1:L) !A copy I can damage.
CALL UPCASE(EXT(1:L)) !Simplify.
FOUND = INDEX(LIST,EXT(1:L)//".") .GT. 0 !Found somewhere?
END IF !The period can't be a character in an extension name.
END FUNCTION FOUND !So, possibilities collapse.
END MODULE TEXTGNASH !Enough for this.
 
PROGRAM POKE
USE TEXTGNASH
INTEGER I,LEX,LFN
INTEGER TESTS
PARAMETER (TESTS = 12)
CHARACTER*80 TEST(TESTS) !A collection.
CHARACTER*(MEXT) EXT
DATA TEST/
1 "Picture.jpg",
2 "http://mywebsite.com/picture/image.png",
3 "myuniquefile.longextension",
4 "IAmAFilenameWithoutExtension",
5 "/path/to.my/file",
6 "file.odd_one",
7 "Mydata.a##",
8 "Mydata.tar.Gz",
9 "MyData.gzip",
o "MyData.7z.backup",
1 "Mydata...",
2 "Mydata"/
CHARACTER*(*) IMAGE !A sequence of approved texts, delimited by .
PARAMETER (IMAGE = ".ZIP.RAR.7Z.GZ.ARCHIVE.A##.") !All in capitals, and ending with a . too.
 
WRITE (6,1) IMAGE
Line 365 ⟶ 371:
1 "File name",40X,"Extension",7X,"Test")
 
DO I = 1,TESTS !Step through the candidates.
LFN = LSTNB(TEST(I)) !Thus do without trailing spaces.
EXT = FEXT(TEST(I)(1:LFN)) !Grab the file name's extension text.
LEX = LSTNB(EXT) !More trailing spaces.
WRITE (6,2) TEST(I)(1:LFN),EXT(1:LEX),FOUND(EXT(1:LEX),IMAGE)
2 FORMAT (A48,A16,L) !Produces tidy columns, aligned right.
END DO !On to the next.
END</lang>
The previous results were obtained when only "image" style files were approved. The approval is no longer so restrictive.
<pre>
To note file name extensions that are amongst .ZIP.RAR.7Z.GZ.ARCHIVE.A##.
File name                                        Extension       Test
                                     Picture.jpg            .jpg F
          http://mywebsite.com/picture/image.png            .png F
                      myuniquefile.longextension  .longextension F
                    IAmAFilenameWithoutExtension                 F
                                /path/to.my/file        .my/file F
                                    file.odd_one        .odd_one F
                                      Mydata.a##            .a## T
                                   Mydata.tar.Gz             .Gz T
                                     MyData.gzip           .gzip F
                                MyData.7z.backup         .backup F
                                       Mydata...               . F
                                          Mydata                 F
</pre>
The approval function relies on a period being special and on no extension containing one, as is stated in the specification. Thus, the likes of "tar.bz2" can't be added to the list of approved extensions. To do so would require some way of removing context dependency from within the list of approved texts, for instance by converting it to some sort of list of elements each of which has a length. So, if there were no more than 255 texts in a list, then character one would state their number, followed by the length of the first text (up to 255) then that number of characters, then the length of the second text, and so on. Then function FOUND would be fully general at the price of a more difficult initialisation of its list.
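
One possible reading of that counted-list scheme is sketched below in free-form Fortran 2003. Nothing here is part of this entry: the module COUNTED_LIST and the routines ADD_TEXT and IN_LIST are hypothetical names, the limits of 255 are not checked, and no case conversion is attempted, so the candidate text is assumed to be in capitals already.
<lang Fortran>module counted_list
  implicit none
contains
  ! Hypothetical helper: append one text to the list. Character one holds the
  ! count of texts; each text is preceded by one character stating its length.
  subroutine add_text(list, text)
    character(len=:), allocatable, intent(inout) :: list
    character(len=*), intent(in) :: text
    if (.not. allocated(list)) list = char(0)       ! an empty list: count zero
    list = char(ichar(list(1:1)) + 1) // list(2:) // char(len(text)) // text
  end subroutine add_text

  ! Hypothetical helper: is TEXT exactly equal to one of the texts in LIST?
  logical function in_list(list, text)
    character(len=*), intent(in) :: list, text
    integer :: i, p, n
    in_list = .false.
    if (len(list) < 1) return
    p = 2                                 ! position of the first length character
    do i = 1, ichar(list(1:1))            ! step through the stated number of texts
      n = ichar(list(p:p))                ! length of this text
      if (n == len(text)) then
        if (list(p+1:p+n) == text) then
          in_list = .true.
          return
        end if
      end if
      p = p + n + 1                       ! skip to the next length character
    end do
  end function in_list
end module counted_list

program counted_demo
  use counted_list
  implicit none
  character(len=:), allocatable :: approved
  call add_text(approved, '.GZ')
  call add_text(approved, '.TAR.BZ2')     ! a period inside a text is now harmless
  print *, in_list(approved, '.TAR.BZ2')  ! T
  print *, in_list(approved, '.BZ2')      ! F: only whole texts can match
  print *, in_list(approved, '.GZ')       ! T
end program counted_demo</lang>
Because every comparison is between whole texts of known length, no delimiter character is needed, which is the price paid in the more laborious construction of the list.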
 
Similarly, the scan of a file name to find the extension part relies on no extension containing a period. If the caller were to determine the text of the extension (rather than using FEXT, which stops with the rightmost period) then whatever results could be presented to function FOUND without difficulty.
 
=={{header|Haskell}}==