Workfile of the Program

Introduction

Eviews is a popular proprietary econometrics program. It is widely used in teaching, and in various places around the Internet one can find datasets made available by publishers of textbooks or by professors in the Eviews workfile format. It struck me that it would be useful if gretl could read this format. There does not appear to be any publicly available specification (not surprising for a proprietary binary format), so I decided to try reverse-engineering. This document sets out my findings. The findings are based on examination of several workfiles from different sources and dates (using Emacs in hexl mode, the strings program, and an exploratory reader program written in C), but I have no idea how general they are. I welcome any corrections or additions.

Overview of format

An Eviews workfile starts with an identifying string, "New MicroTSP Workfile", which seems to be padded out to 24 bytes with NUL characters. This is followed by a header of variable size, within which certain key information nevertheless seems to occur at fixed offsets. Then comes a series of 70-byte records, containing information on the data series in the file (and possibly information on other objects in some cases?). The central section of the file contains blocks of actual data, stored as doubles, and other information on the variables. The stream positions of these blocks are given in the preceding 70-byte records. All numbers seem to be stored in little-endian byte order. The examples I have seen also have a substantial swathe of NUL bytes in the central section. The file ends with a trailer section that includes the name of the file and strings representing the starting and ending observations.
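As a rough illustration of the first step, a reader might check the identifying string before going any further. The sketch below is in Python rather than the C of my exploratory reader, the function name looks_like_workfile is purely illustrative, and it assumes the padding really is NUL bytes out to 24 bytes, which is only what the files I have seen suggest.

```python
import io

MAGIC = b"New MicroTSP Workfile"  # identifying string described above
MAGIC_FIELD_LEN = 24              # assumption: NUL-padded to 24 bytes


def looks_like_workfile(stream):
    """Return True if the stream starts with the NUL-padded identifier."""
    field = stream.read(MAGIC_FIELD_LEN)
    if len(field) < MAGIC_FIELD_LEN:
        return False  # too short to hold the identifier field
    return field.rstrip(b"\x00") == MAGIC


# Synthetic bytes for demonstration only, not a real workfile.
sample = io.BytesIO(MAGIC.ljust(MAGIC_FIELD_LEN, b"\x00") + b"rest of file")
print(looks_like_workfile(sample))  # prints True
```

With a real file one would pass a handle opened in binary mode, e.g. open(path, "rb"), in place of the BytesIO object.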

Header section

As mentioned above, the header is of variable size. I'm unsure of exactly where the header ends and the series of 70-byte records begins, so I don't know the exact size of the header in any instance, but a common size seems to be 144 or 146 bytes (excluding the leading 24 bytes). In some files I've looked at the header is 32 bytes larger than this. The fields within the header that appear to be fixed are shown below (byte offsets are decimal and relative to the start of the file; lengths are in bytes). As you'll see, there's a lot here that doesn't yet make sense to me.

offset    length      comment

0         80          ???
80        8           long: size of header
88        26          ???
114       4           int: number of variables + 1
118       4           time_t: date of last modification of the file (or zero)
122       2           ???
124       2           short: data frequency (e.g. 1 for annual, 4 for quarterly)
126       2           short: starting subperiod (for, e.g., quarterly or monthly data) or zero
128       4           int: starting observation (e.g. year)
132       8           ???
140       4           int: number of observations
144       variable    ??? (mostly NULs)
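To make the table concrete, here is a minimal Python sketch that pulls the identified fields out of the first 144 bytes of a file. The field names, the parse_header function and the choice of signed integer types are my own assumptions for illustration; only the offsets, lengths and little-endian byte order come from the findings above.

```python
import struct

# (name, offset, struct format); "<" means little-endian.
# Offsets and widths follow the table; signedness is an assumption.
HEADER_FIELDS = [
    ("header_size", 80, "<q"),       # 8-byte long
    ("nvars_plus_one", 114, "<i"),   # number of variables + 1
    ("mtime", 118, "<i"),            # 4-byte time_t, or zero
    ("frequency", 124, "<h"),        # e.g. 1 annual, 4 quarterly
    ("start_subperiod", 126, "<h"),  # or zero
    ("start_obs", 128, "<i"),        # e.g. year
    ("n_obs", 140, "<i"),            # number of observations
]
FIXED_PART_LEN = 144  # the last known field ends at offset 144


def parse_header(buf):
    """Decode the fixed-offset header fields from the start of a workfile."""
    if len(buf) < FIXED_PART_LEN:
        raise ValueError("buffer too short for the fixed header fields")
    return {name: struct.unpack_from(fmt, buf, off)[0]
            for name, off, fmt in HEADER_FIELDS}


# Synthetic header for demonstration only: quarterly data from 1960:1,
# 100 observations, 4 variables. Not taken from a real workfile.
fake = bytearray(FIXED_PART_LEN)
struct.pack_into("<q", fake, 80, 144)
struct.pack_into("<i", fake, 114, 5)
struct.pack_into("<h", fake, 124, 4)
struct.pack_into("<h", fake, 126, 1)
struct.pack_into("<i", fake, 128, 1960)
struct.pack_into("<i", fake, 140, 100)

header = parse_header(bytes(fake))
print(header["nvars_plus_one"] - 1, "variables,", header["n_obs"], "observations")
```

Reading each field at its absolute offset, rather than unpacking one long struct, keeps the unknown stretches out of the code and makes it easy to add a field once its meaning is worked out.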

The long at offset 80 gives a number that is closely related to the stream position of the start of the series of 70-byte variable records. For example, in some files the value is 144. Add 24 bytes for the initial identifier and you get ...