annotate design.txt @ 53:a6f39e595b2b wrap-data-access

Merged changes from default branch
author Artem Tikhomirov <tikhomirov.artem@gmail.com>
date Sun, 16 Jan 2011 01:40:38 +0100
parents 26e3eeaa3962
children 05829a70b30b
rev   line source
1
a3576694a4d1 Repository detection from local/specified directory
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
1 FileStructureWalker (pass HgFile, HgFolder to callable; which can ask for VCS data from any file)
a3576694a4d1 Repository detection from local/specified directory
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
2 External uses: user browses files, selects one and asks for its history
a3576694a4d1 Repository detection from local/specified directory
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
3 Params: tip/revision;
a3576694a4d1 Repository detection from local/specified directory
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
4 Implementation: manifest
a3576694a4d1 Repository detection from local/specified directory
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
5
a3576694a4d1 Repository detection from local/specified directory
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
6 Log --rev
a3576694a4d1 Repository detection from local/specified directory
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
7 Log <file>
2
08db726a0fb7 Shaping out low-level Hg structures
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 1
diff changeset
8 HgDataFile.history() or Changelog.history(file)?
08db726a0fb7 Shaping out low-level Hg structures
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 1
diff changeset
9
08db726a0fb7 Shaping out low-level Hg structures
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 1
diff changeset
10
08db726a0fb7 Shaping out low-level Hg structures
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 1
diff changeset
11 Changelog.all() to return list with placeholder, not-parsed elements (i.e. read only compressedLen field and skip to next record), so that
08db726a0fb7 Shaping out low-level Hg structures
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 1
diff changeset
12 total number of elements in the list is correct
1
a3576694a4d1 Repository detection from local/specified directory
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
13
a3576694a4d1 Repository detection from local/specified directory
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
14 hg cat
2
08db726a0fb7 Shaping out low-level Hg structures
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 1
diff changeset
15 Implementation: logic to find file by name in the repository is the same with Log and other commands
08db726a0fb7 Shaping out low-level Hg structures
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 1
diff changeset
16
08db726a0fb7 Shaping out low-level Hg structures
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 1
diff changeset
17
08db726a0fb7 Shaping out low-level Hg structures
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 1
diff changeset
18 Revlog
4
aa1912c70b36 Fix offset issue for inline revlogs. Commandline processing.
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 2
diff changeset
19 What happens when big entry is added to a file - when it detects it can't longer fit into .i and needs .d? Inline flag and .i format changes?
aa1912c70b36 Fix offset issue for inline revlogs. Commandline processing.
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 2
diff changeset
20
22
603806cd2dc6 Status of local working dir against non-tip base revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 21
diff changeset
21 What's hg natural way to see nodeids of specific files (i.e. when I do 'hg --debug manifest -r 11' and see nodeid of some file, and
603806cd2dc6 Status of local working dir against non-tip base revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 21
diff changeset
22 then would like to see what changeset this file came from)?
4
aa1912c70b36 Fix offset issue for inline revlogs. Commandline processing.
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 2
diff changeset
23
aa1912c70b36 Fix offset issue for inline revlogs. Commandline processing.
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 2
diff changeset
24 ----------
6
5abe5af181bd Ant script to build commands and run sample
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 5
diff changeset
25 + support patch from baseRev + few deltas (although done in a way patches are applied one by one instead of accumulated)
4
aa1912c70b36 Fix offset issue for inline revlogs. Commandline processing.
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 2
diff changeset
26 + command-line samples (-R, filenames) (Log & Cat) to show on any repo
6
5abe5af181bd Ant script to build commands and run sample
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 5
diff changeset
27 +buildfile + run samples
9
d6d2a630f4a6 Access to underlaying file data wrapped into own Access object, implemented with FileChannel and ByteBuffer
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 6
diff changeset
28 *input stream impl + lifecycle. Step forward with FileChannel and ByteBuffer, although questionable accomplishment (looks bit complicated, cumbersome)
14
442dc6ee647b Show correct time
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 11
diff changeset
29 + dirstate.mtime
43
1b26247d7367 Calculate result length of the patch operarion, when unknown
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 41
diff changeset
30 +calculate sha1 digest for file to see I can deal with nodeid. +Do this correctly (smaller nodeid - first)
18
02ee376bee79 status operation against current working directory
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 17
diff changeset
31 *.hgignored processing
25
da8ccbfae64d Reflect Nodeid's array is exactly 20
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 22
diff changeset
32 +Nodeid to keep 20 bytes always, Revlog.Inspector to get nodeid array of meaningful data exact size (nor heading 00 bytes, nor 12 extra bytes from the spec)
26
71a9ba42cee8 Memory-mapped files for bigger files. Defect reading number of bytes greater than size of the buffer fixed
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 25
diff changeset
33 +DataAccess - implement memory mapped files,
49
26e3eeaa3962 branch and user filtering for log operation
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 43
diff changeset
34 +Changeset to get index (local revision number)
4
aa1912c70b36 Fix offset issue for inline revlogs. Commandline processing.
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 2
diff changeset
35
26
71a9ba42cee8 Memory-mapped files for bigger files. Defect reading number of bytes greater than size of the buffer fixed
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 25
diff changeset
36 DataAccess - collect debug info (buffer misses, file size/total read operations) to find out better strategy to buffer size detection. Compare performance.
4
aa1912c70b36 Fix offset issue for inline revlogs. Commandline processing.
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 2
diff changeset
37 delta merge
14
442dc6ee647b Show correct time
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 11
diff changeset
38 RevisionWalker (on manifest) and WorkingCopyWalker (io.File) talking to ? and/or dirstate
33
565ce0835674 TODO added, to try stream for unzip in revlog
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 26
diff changeset
39 RevlogStream - Inflater. Perhaps, InflaterStream instead?
41
858d1b2458cb Check integrity for bundle changelog. Sort nodeids when calculating hash
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 33
diff changeset
40 Implement use of fncache (use names from it - perhaps, would help for Mac issues Alex mentioned) along with 'digest'-ing long file names
858d1b2458cb Check integrity for bundle changelog. Sort nodeids when calculating hash
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 33
diff changeset
41
858d1b2458cb Check integrity for bundle changelog. Sort nodeids when calculating hash
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 33
diff changeset
42
20
11cfabe692b3 Status operation for two repository revisions (no local dir involved)
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 18
diff changeset
43
18
02ee376bee79 status operation against current working directory
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 17
diff changeset
44 Status operation from GUI - guess, usually on a file/subfolder, hence API should allow for starting path (unlike cmdline, seems useless to implement include/exclide patterns - GUI users hardly enter them, ever)
9
d6d2a630f4a6 Access to underlaying file data wrapped into own Access object, implemented with FileChannel and ByteBuffer
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 6
diff changeset
45
d6d2a630f4a6 Access to underlaying file data wrapped into own Access object, implemented with FileChannel and ByteBuffer
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 6
diff changeset
46
15
865bf07f381f Basic hgignore handling
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 14
diff changeset
47 ??? encodings of fncache, .hgignore, dirstate
16
254078595653 Print manifest nodeid
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 15
diff changeset
48 ??? http://mercurial.selenic.com/wiki/Manifest says "Multiple changesets may refer to the same manifest revision". To me, each changeset
254078595653 Print manifest nodeid
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 15
diff changeset
49 changes repository, hence manifest should update nodeids of the files it lists, effectively creating new manifest revision.
15
865bf07f381f Basic hgignore handling
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 14
diff changeset
50
9
d6d2a630f4a6 Access to underlaying file data wrapped into own Access object, implemented with FileChannel and ByteBuffer
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 6
diff changeset
51 >>>> Effective file read/data access
d6d2a630f4a6 Access to underlaying file data wrapped into own Access object, implemented with FileChannel and ByteBuffer
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 6
diff changeset
52 ReadOperation, Revlog does: repo.getFileSystem().run(this.file, new ReadOperation(), long start=0, long end = -1)
d6d2a630f4a6 Access to underlaying file data wrapped into own Access object, implemented with FileChannel and ByteBuffer
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 6
diff changeset
53 ReadOperation gets buffer (of whatever size, as decided by FS impl), parses it and then reports if needs more data.
d6d2a630f4a6 Access to underlaying file data wrapped into own Access object, implemented with FileChannel and ByteBuffer
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 6
diff changeset
54 This helps to ensure streams are closed after reading, allows caching (if the same file (or LRU) is read few times in sequence)
d6d2a630f4a6 Access to underlaying file data wrapped into own Access object, implemented with FileChannel and ByteBuffer
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 6
diff changeset
55 and allows buffer management (i.e. reuse. Single buffer for all reads).
d6d2a630f4a6 Access to underlaying file data wrapped into own Access object, implemented with FileChannel and ByteBuffer
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 6
diff changeset
56 Scheduling multiple operations (in future, to deal with writes - single queue for FS operations - no locks?)
d6d2a630f4a6 Access to underlaying file data wrapped into own Access object, implemented with FileChannel and ByteBuffer
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 6
diff changeset
57
d6d2a630f4a6 Access to underlaying file data wrapped into own Access object, implemented with FileChannel and ByteBuffer
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 6
diff changeset
58 File access:
d6d2a630f4a6 Access to underlaying file data wrapped into own Access object, implemented with FileChannel and ByteBuffer
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 6
diff changeset
59 * NIO and mapped files - should be fast. Although seems to give less control on mem usage.
21
e929cecae4e1 Refactor to move revlog content to base class
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 20
diff changeset
60 * Regular InputStreams and chunked stream on top - allocate List<byte[]>, each (but last) chunk of fixed size (depending on initial file size)
9
d6d2a630f4a6 Access to underlaying file data wrapped into own Access object, implemented with FileChannel and ByteBuffer
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 6
diff changeset
61
26
71a9ba42cee8 Memory-mapped files for bigger files. Defect reading number of bytes greater than size of the buffer fixed
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 25
diff changeset
62 <<<<<
71a9ba42cee8 Memory-mapped files for bigger files. Defect reading number of bytes greater than size of the buffer fixed
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 25
diff changeset
63
71a9ba42cee8 Memory-mapped files for bigger files. Defect reading number of bytes greater than size of the buffer fixed
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 25
diff changeset
64 Tests:
71a9ba42cee8 Memory-mapped files for bigger files. Defect reading number of bytes greater than size of the buffer fixed
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 25
diff changeset
65 DataAccess - readBytes(length > memBufferSize, length*2 > memBufferSize) - to check impl is capable to read huge chunks of data, regardless of own buffer size