cpython/Demo/scripts/markov.py
Georg Brandl 22fff43633 Merged revisions 74609,74627,74634,74645,74651,74738,74840,75016,75316-75317,75323-75324,75326,75328,75330,75338,75340-75341,75343,75352-75353,75355,75357,75359 via svnmerge from
svn+ssh://svn.python.org/python/branches/py3k

................
  r74609 | senthil.kumaran | 2009-08-31 18:43:45 +0200 (Mo, 31 Aug 2009) | 3 lines

  Doc fix for issue2637.
................
  r74627 | georg.brandl | 2009-09-02 22:31:26 +0200 (Mi, 02 Sep 2009) | 1 line

  #6819: fix typo.
................
  r74634 | georg.brandl | 2009-09-03 14:34:10 +0200 (Do, 03 Sep 2009) | 9 lines

  Merged revisions 74633 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r74633 | georg.brandl | 2009-09-03 14:31:39 +0200 (Do, 03 Sep 2009) | 1 line

    #6757: complete the list of types that marshal can serialize.
  ........
................
  r74645 | georg.brandl | 2009-09-04 10:07:32 +0200 (Fr, 04 Sep 2009) | 1 line

  #5221: fix related topics: SEQUENCEMETHODS[12] doesn't exist any more.
................
  r74651 | georg.brandl | 2009-09-04 13:20:54 +0200 (Fr, 04 Sep 2009) | 9 lines

  Recorded merge of revisions 74650 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r74650 | georg.brandl | 2009-09-04 13:19:34 +0200 (Fr, 04 Sep 2009) | 1 line

    #5101: add back tests to test_funcattrs that were lost during unittest conversion, and make some PEP8 cleanups.
  ........
................
  r74738 | georg.brandl | 2009-09-09 18:51:05 +0200 (Mi, 09 Sep 2009) | 9 lines

  Merged revisions 74737 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r74737 | georg.brandl | 2009-09-09 18:49:13 +0200 (Mi, 09 Sep 2009) | 1 line

    Properly document copy and deepcopy as functions.
  ........
................
  r74840 | georg.brandl | 2009-09-16 18:40:45 +0200 (Mi, 16 Sep 2009) | 13 lines

  Merged revisions 74838-74839 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r74838 | georg.brandl | 2009-09-16 18:22:12 +0200 (Mi, 16 Sep 2009) | 1 line

    Remove some more boilerplate from the actual tests in test_pdb.
  ........
    r74839 | georg.brandl | 2009-09-16 18:36:39 +0200 (Mi, 16 Sep 2009) | 1 line

    Make the pdb displayhook compatible with the standard displayhook: do not print Nones. Add a test for that.
  ........
................
  r75016 | georg.brandl | 2009-09-22 15:53:14 +0200 (Di, 22 Sep 2009) | 1 line

  #6969: make it explicit that configparser writes/reads text files, and fix the example.
................
  r75316 | georg.brandl | 2009-10-10 23:12:35 +0200 (Sa, 10 Okt 2009) | 9 lines

  Merged revisions 75313 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r75313 | georg.brandl | 2009-10-10 23:07:35 +0200 (Sa, 10 Okt 2009) | 1 line

    Bring old demo up-to-date.
  ........
................
  r75317 | georg.brandl | 2009-10-10 23:13:21 +0200 (Sa, 10 Okt 2009) | 9 lines

  Merged revisions 75315 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r75315 | georg.brandl | 2009-10-10 23:10:05 +0200 (Sa, 10 Okt 2009) | 1 line

    Remove unneeded "L" suffixes.
  ........
................
  r75323 | georg.brandl | 2009-10-10 23:48:05 +0200 (Sa, 10 Okt 2009) | 9 lines

  Recorded merge of revisions 75321 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r75321 | georg.brandl | 2009-10-10 23:43:21 +0200 (Sa, 10 Okt 2009) | 1 line

    Remove outdated comment and fix a few style issues.
  ........
................
  r75324 | georg.brandl | 2009-10-10 23:49:24 +0200 (Sa, 10 Okt 2009) | 9 lines

  Merged revisions 75322 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r75322 | georg.brandl | 2009-10-10 23:47:31 +0200 (Sa, 10 Okt 2009) | 1 line

    Show use of range() step argument nicely.
  ........
................
  r75326 | georg.brandl | 2009-10-10 23:57:03 +0200 (Sa, 10 Okt 2009) | 9 lines

  Merged revisions 75325 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r75325 | georg.brandl | 2009-10-10 23:55:11 +0200 (Sa, 10 Okt 2009) | 1 line

    Modernize factorisation demo (mostly augassign.)
  ........
................
  r75328 | georg.brandl | 2009-10-11 00:05:26 +0200 (So, 11 Okt 2009) | 9 lines

  Merged revisions 75327 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r75327 | georg.brandl | 2009-10-11 00:03:43 +0200 (So, 11 Okt 2009) | 1 line

    Style fixes.
  ........
................
  r75330 | georg.brandl | 2009-10-11 00:32:28 +0200 (So, 11 Okt 2009) | 9 lines

  Merged revisions 75329 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r75329 | georg.brandl | 2009-10-11 00:26:45 +0200 (So, 11 Okt 2009) | 1 line

    Modernize all around (don't ask me how useful that script is nowadays...)
  ........
................
  r75338 | georg.brandl | 2009-10-11 10:31:41 +0200 (So, 11 Okt 2009) | 9 lines

  Merged revisions 75337 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r75337 | georg.brandl | 2009-10-11 10:18:44 +0200 (So, 11 Okt 2009) | 1 line

    Update morse script, avoid globals, use iterators.
  ........
................
  r75340 | georg.brandl | 2009-10-11 10:42:09 +0200 (So, 11 Okt 2009) | 9 lines

  Merged revisions 75339 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r75339 | georg.brandl | 2009-10-11 10:39:16 +0200 (So, 11 Okt 2009) | 1 line

    Update markov demo.
  ........
................
  r75341 | georg.brandl | 2009-10-11 10:43:08 +0200 (So, 11 Okt 2009) | 1 line

  Fix README description.
................
  r75343 | georg.brandl | 2009-10-11 10:46:56 +0200 (So, 11 Okt 2009) | 9 lines

  Merged revisions 75342 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r75342 | georg.brandl | 2009-10-11 10:45:03 +0200 (So, 11 Okt 2009) | 1 line

    Remove useless script "mkrcs" and update README.
  ........
................
  r75352 | georg.brandl | 2009-10-11 14:04:10 +0200 (So, 11 Okt 2009) | 9 lines

  Merged revisions 75350 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r75350 | georg.brandl | 2009-10-11 14:00:18 +0200 (So, 11 Okt 2009) | 1 line

    Use getopt in script.py demo.
  ........
................
  r75353 | georg.brandl | 2009-10-11 14:04:40 +0200 (So, 11 Okt 2009) | 9 lines

  Merged revisions 75351 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r75351 | georg.brandl | 2009-10-11 14:03:01 +0200 (So, 11 Okt 2009) | 1 line

    Fix variable.
  ........
................
  r75355 | georg.brandl | 2009-10-11 16:27:51 +0200 (So, 11 Okt 2009) | 9 lines

  Merged revisions 75354 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r75354 | georg.brandl | 2009-10-11 16:23:49 +0200 (So, 11 Okt 2009) | 1 line

    Update lpwatch script.
  ........
................
  r75357 | georg.brandl | 2009-10-11 16:50:57 +0200 (So, 11 Okt 2009) | 9 lines

  Merged revisions 75356 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r75356 | georg.brandl | 2009-10-11 16:49:37 +0200 (So, 11 Okt 2009) | 1 line

    Remove ftpstats script, the daemon whose log files it reads is long gone.
  ........
................
  r75359 | georg.brandl | 2009-10-11 17:56:06 +0200 (So, 11 Okt 2009) | 9 lines

  Merged revisions 75358 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r75358 | georg.brandl | 2009-10-11 17:06:44 +0200 (So, 11 Okt 2009) | 1 line

    Overhaul of Demo/xml.
  ........
................
2009-10-27 20:19:02 +00:00

122 lines
3.5 KiB
Python
Executable File

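#! /usr/bin/env python

class Markov:
    def __init__(self, histsize, choice):
        self.histsize = histsize    # number of trailing items used as the state
        self.choice = choice        # callable that picks one item from a list
        self.trans = {}             # maps state -> list of observed successors

    def add(self, state, next):
        self.trans.setdefault(state, []).append(next)

    def put(self, seq):
        n = self.histsize
        add = self.add
        # None as a state marks the start; the empty slice seeds generation.
        add(None, seq[:0])
        for i in range(len(seq)):
            # Record the item at i as a successor of the up-to-n items before it.
            add(seq[max(0, i-n):i], seq[i:i+1])
        # None as a successor marks the end of the sequence.
        add(seq[len(seq)-n:], None)

    def get(self):
        choice = self.choice
        trans = self.trans
        n = self.histsize
        seq = choice(trans[None])
        while True:
            subseq = seq[max(0, len(seq)-n):]
            options = trans[subseq]
            next = choice(options)
            if not next:
                break
            seq += next
        return seq
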

def test():
    import sys, random, getopt
    args = sys.argv[1:]
    try:
        opts, args = getopt.getopt(args, '0123456789cdwq')
    except getopt.error:
        print('Usage: %s [-#] [-cddqw] [file] ...' % sys.argv[0])
        print('Options:')
        print('-#: 1-digit history size (default 2)')
        print('-c: characters (default)')
        print('-w: words')
        print('-d: more debugging output')
        print('-q: no debugging output')
        print('Input files (default stdin) are split in paragraphs')
        print('separated by blank lines and each paragraph is split')
        print('in words by whitespace, then reconcatenated with')
        print('exactly one space separating words.')
        print('Output consists of paragraphs separated by blank')
        print('lines, where lines are no longer than 72 characters.')
        sys.exit(2)
    histsize = 2
    do_words = False
    debug = 1
    for o, a in opts:
        if '-0' <= o <= '-9': histsize = int(o[1:])
        if o == '-c': do_words = False
        if o == '-d': debug += 1
        if o == '-q': debug = 0
        if o == '-w': do_words = True
    if not args:
        args = ['-']

    m = Markov(histsize, random.choice)
    try:
        for filename in args:
            if filename == '-':
                f = sys.stdin
                if f.isatty():
                    print('Sorry, need stdin from file')
                    continue
            else:
                f = open(filename, 'r')
            if debug: print('processing', filename, '...')
            text = f.read()
            f.close()
            paralist = text.split('\n\n')
            for para in paralist:
                if debug > 1: print('feeding ...')
                words = para.split()
                if words:
                    if do_words:
                        data = tuple(words)
                    else:
                        data = ' '.join(words)
                    m.put(data)
    except KeyboardInterrupt:
        print('Interrupted -- continue with data read so far')
    if not m.trans:
        print('No valid input files')
        return
    if debug: print('done.')

    if debug > 1:
        for key in m.trans.keys():
            if key is None or len(key) < histsize:
                print(repr(key), m.trans[key])
        # The empty history is '' in character mode but () in word mode.
        empty = () if do_words else ''
        if histsize == 0: print(repr(empty), m.trans[empty])
        print()
    # Generate paragraphs forever; stop with an interrupt.
    while True:
        data = m.get()
        if do_words:
            words = data
        else:
            words = data.split()
        n = 0
        limit = 72
        for w in words:
            if n + len(w) > limit:
                print()
                n = 0
            print(w, end=' ')
            n += len(w) + 1
        print()
        print()

if __name__ == "__main__":
    test()
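
The `test()` driver above reads files and loops forever, so it can be hard to see what the `Markov` class does on its own. The following is a minimal sketch of using the class directly in word mode. The class is repeated in compact form only so the example runs by itself. The two-sentence corpus, the variable names `corpus`, `rng`, `model` and `generated`, and the fixed seed are all assumptions made for illustration and are not part of the original script.

```python
import random


class Markov:
    """Compact copy of the Markov class, repeated so this sketch is standalone."""

    def __init__(self, histsize, choice):
        self.histsize = histsize
        self.choice = choice
        self.trans = {}

    def add(self, state, next):
        self.trans.setdefault(state, []).append(next)

    def put(self, seq):
        n = self.histsize
        self.add(None, seq[:0])                      # start marker
        for i in range(len(seq)):
            self.add(seq[max(0, i - n):i], seq[i:i + 1])
        self.add(seq[len(seq) - n:], None)           # end marker

    def get(self):
        n = self.histsize
        seq = self.choice(self.trans[None])
        while True:
            nxt = self.choice(self.trans[seq[max(0, len(seq) - n):]])
            if not nxt:
                break
            seq += nxt
        return seq


# Hypothetical corpus, fed in word mode (each sentence as a tuple of words),
# which mirrors what test() does when the -w option is given.
corpus = ["the cat sat on the mat", "the dog sat on the log"]

# A seeded generator is an assumption made only so runs are repeatable;
# the script itself passes random.choice.
rng = random.Random(0)
model = Markov(1, rng.choice)
for sentence in corpus:
    model.put(tuple(sentence.split()))

# Each call to get() walks the transition table until an end marker is drawn.
generated = [" ".join(model.get()) for _ in range(5)]
for line in generated:
    print(line)
```

With a history size of 1, every generated sentence starts with "the" and ends with "mat" or "log", because those are the only observed start and end words in this corpus. The words in between can recombine across the two sentences.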