Talk:Text processing/1

==Why?==
I was reading through [http://paddy3118.blogspot.com/2007/01/data-mining-in-three-language05.html old blog entries] and thought it would be appropriate (minus the focus on speed).

==Please clarify the task==
# Should syntax errors in the file be detected?
# What is the field separator: a single space, any non-empty run of spaces, any non-empty run of spaces or tabs, or something else?
# Should the average be computed over all fields together, or over each field separately?
# When some fields on a line are flagged invalid but others are not, does that count as a gap? Or does it count only when all fields are invalid?
# Further, do valid fields participate in the averaging when other fields on the same line are invalid?
# When a field is missing, is that a syntax error or a gap?
# What should be done with syntactically wrong fields (not a number, number too large, etc.)?
--[[User:Dmitry-kazakov|Dmitry-kazakov]] 12:13, 8 November 2008 (UTC)