\input gets slower after each \input
bnb at tug.org
Tue Apr 27 17:34:38 CEST 2021
On Tue, 27 Apr 2021, Phelype Oleinik wrote:
> On 27/04/2021 04:38, Ulrike Fischer wrote:
>> On Mon, 26 Apr 2021 20:19:51 -0300, Phelype Oleinik wrote:
>>> Dear list,
>>> I was doing some tests with file reading and I found that \input gets
>>> a tiny bit slower each time it is used. Why is that?
>>> I made a test document that shows this behaviour.
>> miktex dies with ! TeX capacity exceeded, sorry [pool size=3208635]
>> on your example.
>> Looking at the stats, one can see that every input adds to pool:
>> 310211 string characters out of 6150784
>> 620211 string characters out of 6150784
> Yes, it does, because TeX has to remember the file stack for error
> reporting, so names are kept in the string pool.
> (In fact, my example also explodes on TeX Live because I added
> twice too many |\test| in the test document, but I didn't notice
> because of |\batchmode|. Sorry about that.)
> I knew about |\input| taking space in the string pool but I didn't think
> that would be the cause because memory access takes constant time.
> However this stackexchange post was brought to my attention:
> in which the user ShreevatsaR does an in-depth debugging to find out
> that the system-dependent changes actually implement a string search,
> which then takes the linear time shown in that example.
> I do wonder how good the tradeoff is nowadays between saving some string
> memory and paying the speed penalty... Is the string space saved only
> when identical file names are input? If so, I don't think that happens
> often enough to justify the effort, does it?
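The linear-time lookup mentioned in the quoted message can be pictured with a toy model. The sketch below is only an illustration in Python, not the actual system-dependent code: the class ToyStringPool, its methods, and the file names file0.tex, file1.tex, ... are all hypothetical. It assumes that every simulated \input leaves one new entry in the pool and first scans the whole pool for an identical string, so the k-th input costs about k comparisons.

```python
class ToyStringPool:
    """Toy model of a string pool with duplicate search (hypothetical)."""

    def __init__(self):
        self.strings = []      # stored strings, oldest first
        self.comparisons = 0   # how many stored strings were examined

    def search(self, wanted):
        """Scan from newest to oldest for an identical string."""
        for index in range(len(self.strings) - 1, -1, -1):
            self.comparisons += 1
            if self.strings[index] == wanted:
                return index
        return None

    def make_string(self, text):
        """Reuse an existing copy if one is found, otherwise append."""
        found = self.search(text)
        if found is not None:
            return found
        self.strings.append(text)
        return len(self.strings) - 1


def simulate_inputs(n):
    """Return the number of comparisons needed by each of n inputs.

    Assumption: every input uses a distinct name, so nothing is reused
    and the pool grows by one entry per input.
    """
    pool = ToyStringPool()
    costs = []
    for k in range(n):
        before = pool.comparisons
        pool.make_string(f"file{k}.tex")
        costs.append(pool.comparisons - before)
    return costs


print(simulate_inputs(5))         # [0, 1, 2, 3, 4]
print(sum(simulate_inputs(100)))  # 4950
```

In this model the cost per input grows linearly with the pool size, so the total over n inputs grows quadratically. The search saves space only when an identical string is already present.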
This has come up before, although with situations involving different
input file names, not the same name as in the present test. When a
file access is completed (definitively closed), ur-TeX would attempt
to remove the name from the string pool. But if anything else had
been added to the string pool, the name was not removed, since the
complications of garbage collection outweighed the space saving. I
don't think that approach has changed, and I don't remember that there
is an effective way to definitively close an \input file (although
that may be irrelevant, since newly input files often add items to the
string pool). The actual production incident was solved by combining
the files to be input, so that there were fewer files actually \input.
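The "remove the name only if nothing else was added" rule described above can be sketched as follows. This is a toy model in Python, not TeX's code: the class ToyPool and its methods are hypothetical, and the pool is modelled as a simple stack from which only the newest entry can be reclaimed.

```python
class ToyPool:
    """Toy stack-like string pool (hypothetical, for illustration only)."""

    def __init__(self):
        self.entries = []

    def add(self, text):
        """Append a string and return its position in the pool."""
        self.entries.append(text)
        return len(self.entries) - 1

    def release_if_last(self, index):
        """Reclaim the entry only if it is still the newest one.

        Anything added after it blocks the removal, since reclaiming an
        entry in the middle would need real garbage collection.
        """
        if index == len(self.entries) - 1:
            self.entries.pop()
            return True
        return False


def demo():
    pool = ToyPool()

    # Case 1: the file adds nothing to the pool, so its name is reclaimed.
    first = pool.add("a.tex")
    reclaimed_first = pool.release_if_last(first)

    # Case 2: the file defines something new, so its name stays.
    second = pool.add("b.tex")
    pool.add("newmacro")  # stands for any string created while reading
    reclaimed_second = pool.release_if_last(second)

    return reclaimed_first, reclaimed_second, list(pool.entries)


print(demo())  # (True, False, ['b.tex', 'newmacro'])
```

In this model, as soon as an input file causes any new string to be stored, its name can no longer be reclaimed, which is consistent with the observation that the pool keeps growing.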