annotate src/org/tmatesoft/hg/internal/EncodingHelper.java @ 600:46f29b73e51e

Utilize RevisionLookup to speed-up getRevisionIndex of both manifest and changelog
author Artem Tikhomirov <tikhomirov.artem@gmail.com>
date Fri, 03 May 2013 17:03:31 +0200
parents 47b7bedf0569
children fba85bc1dfb8
rev   line source
320
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
1 /*
526
2f9ed6bcefa2 Initial support for Revert command with accompanying minor refactoring
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 525
diff changeset
2 * Copyright (c) 2011-2013 TMate Software Ltd
320
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
3 *
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
4 * This program is free software; you can redistribute it and/or modify
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
5 * it under the terms of the GNU General Public License as published by
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
6 * the Free Software Foundation; version 2 of the License.
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
7 *
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
8 * This program is distributed in the hope that it will be useful,
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
9 * but WITHOUT ANY WARRANTY; without even the implied warranty of
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
10 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
11 * GNU General Public License for more details.
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
12 *
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
13 * For information on how to redistribute this software under
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
14 * the terms of a license other than GNU General Public License
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
15 * contact TMate Software at support@hg4j.com
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
16 */
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
17 package org.tmatesoft.hg.internal;
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
18
456
909306e412e2 Refactor LogFacility and SessionContext, better API for both
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 418
diff changeset
19 import static org.tmatesoft.hg.util.LogFacility.Severity.Error;
909306e412e2 Refactor LogFacility and SessionContext, better API for both
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 418
diff changeset
20
412
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
21 import java.nio.ByteBuffer;
415
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
22 import java.nio.CharBuffer;
412
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
23 import java.nio.charset.CharacterCodingException;
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
24 import java.nio.charset.Charset;
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
25 import java.nio.charset.CharsetDecoder;
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
26 import java.nio.charset.CharsetEncoder;
320
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
27
415
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
28 import org.tmatesoft.hg.core.SessionContext;
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
29
320
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
30 /**
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
31 * Keep all encoding-related issues in the single place
415
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
32 * NOT thread-safe (encoder and decoder requires synchronized access)
320
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
33 * @author Artem Tikhomirov
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
34 * @author TMate Software Ltd.
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
35 */
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
36 public class EncodingHelper {
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
37 // XXX perhaps, shall not be full of statics, but rather an instance coming from e.g. HgRepository?
412
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
38 /*
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
39 * To understand what Mercurial thinks of UTF-8 and Unix byte approach to names, see
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
40 * http://mercurial.808500.n3.nabble.com/Unicode-support-request-td3430704.html
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
41 */
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
42
415
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
43 private final SessionContext sessionContext;
412
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
44 private final CharsetEncoder encoder;
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
45 private final CharsetDecoder decoder;
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
46
415
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
47 EncodingHelper(Charset fsEncoding, SessionContext ctx) {
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
48 sessionContext = ctx;
412
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
49 decoder = fsEncoding.newDecoder();
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
50 encoder = fsEncoding.newEncoder();
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
51 }
320
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
52
418
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
53 /**
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
54 * Translate file names from manifest to amazing Unicode string
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
55 */
412
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
56 public String fromManifest(byte[] data, int start, int length) {
418
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
57 return decodeWithSystemDefaultFallback(data, start, length);
320
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
58 }
418
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
59
415
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
60 /**
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
61 * @return byte representation of the string directly comparable to bytes in manifest
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
62 */
526
2f9ed6bcefa2 Initial support for Revert command with accompanying minor refactoring
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 525
diff changeset
63 public byte[] toManifest(CharSequence s) {
415
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
64 if (s == null) {
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
65 // perhaps, can return byte[0] in this case?
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
66 throw new IllegalArgumentException();
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
67 }
525
0be5be8d57e9 Repository checkout support, first iteration
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 456
diff changeset
68 return encodeWithSystemDefaultFallback(s);
0be5be8d57e9 Repository checkout support, first iteration
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 456
diff changeset
69 }
0be5be8d57e9 Repository checkout support, first iteration
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 456
diff changeset
70
0be5be8d57e9 Repository checkout support, first iteration
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 456
diff changeset
71 /**
0be5be8d57e9 Repository checkout support, first iteration
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 456
diff changeset
72 * Translate file names from dirstate to amazing Unicode string
0be5be8d57e9 Repository checkout support, first iteration
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 456
diff changeset
73 */
0be5be8d57e9 Repository checkout support, first iteration
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 456
diff changeset
74 public String fromDirstate(byte[] data, int start, int length) {
0be5be8d57e9 Repository checkout support, first iteration
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 456
diff changeset
75 return decodeWithSystemDefaultFallback(data, start, length);
0be5be8d57e9 Repository checkout support, first iteration
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 456
diff changeset
76 }
0be5be8d57e9 Repository checkout support, first iteration
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 456
diff changeset
77
526
2f9ed6bcefa2 Initial support for Revert command with accompanying minor refactoring
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 525
diff changeset
78 public byte[] toDirstate(CharSequence fname) {
525
0be5be8d57e9 Repository checkout support, first iteration
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 456
diff changeset
79 if (fname == null) {
0be5be8d57e9 Repository checkout support, first iteration
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 456
diff changeset
80 throw new IllegalArgumentException();
0be5be8d57e9 Repository checkout support, first iteration
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 456
diff changeset
81 }
0be5be8d57e9 Repository checkout support, first iteration
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 456
diff changeset
82 return encodeWithSystemDefaultFallback(fname);
0be5be8d57e9 Repository checkout support, first iteration
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 456
diff changeset
83 }
0be5be8d57e9 Repository checkout support, first iteration
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 456
diff changeset
84
0be5be8d57e9 Repository checkout support, first iteration
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 456
diff changeset
85 private String decodeWithSystemDefaultFallback(byte[] data, int start, int length) {
415
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
86 try {
525
0be5be8d57e9 Repository checkout support, first iteration
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 456
diff changeset
87 return decoder.decode(ByteBuffer.wrap(data, start, length)).toString();
0be5be8d57e9 Repository checkout support, first iteration
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 456
diff changeset
88 } catch (CharacterCodingException ex) {
0be5be8d57e9 Repository checkout support, first iteration
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 456
diff changeset
89 sessionContext.getLog().dump(getClass(), Error, ex, String.format("Use of charset %s failed, resort to system default", charset().name()));
0be5be8d57e9 Repository checkout support, first iteration
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 456
diff changeset
90 // resort to system-default
0be5be8d57e9 Repository checkout support, first iteration
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 456
diff changeset
91 return new String(data, start, length);
0be5be8d57e9 Repository checkout support, first iteration
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 456
diff changeset
92 }
0be5be8d57e9 Repository checkout support, first iteration
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 456
diff changeset
93 }
0be5be8d57e9 Repository checkout support, first iteration
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 456
diff changeset
94
526
2f9ed6bcefa2 Initial support for Revert command with accompanying minor refactoring
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 525
diff changeset
95 private byte[] encodeWithSystemDefaultFallback(CharSequence s) {
525
0be5be8d57e9 Repository checkout support, first iteration
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 456
diff changeset
96 try {
0be5be8d57e9 Repository checkout support, first iteration
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 456
diff changeset
97 // synchronized(encoder) {
415
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
98 ByteBuffer bb = encoder.encode(CharBuffer.wrap(s));
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
99 // }
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
100 byte[] rv = new byte[bb.remaining()];
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
101 bb.get(rv, 0, rv.length);
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
102 return rv;
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
103 } catch (CharacterCodingException ex) {
456
909306e412e2 Refactor LogFacility and SessionContext, better API for both
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 418
diff changeset
104 sessionContext.getLog().dump(getClass(), Error, ex, String.format("Use of charset %s failed, resort to system default", charset().name()));
415
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
105 // resort to system-default
526
2f9ed6bcefa2 Initial support for Revert command with accompanying minor refactoring
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 525
diff changeset
106 return s.toString().getBytes();
415
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
107 }
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
108 }
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
109
418
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
110 private Charset charset() {
412
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
111 return encoder.charset();
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
112 }
415
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
113
527
47b7bedf0569 Tests for present HgCheckoutCommand functionality. Update branch information on checkout. Use UTF8 encoding for the branch file
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 526
diff changeset
114 public static Charset getUTF8() {
47b7bedf0569 Tests for present HgCheckoutCommand functionality. Update branch information on checkout. Use UTF8 encoding for the branch file
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 526
diff changeset
115 return Charset.forName("UTF-8");
47b7bedf0569 Tests for present HgCheckoutCommand functionality. Update branch information on checkout. Use UTF8 encoding for the branch file
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 526
diff changeset
116 }
320
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
117 }