annotate src/org/tmatesoft/hg/internal/EncodingHelper.java @ 510:90093ee56c0d

Full-fledged test repo to follow file history. Investigating iteration direction alternatives (from new to old in addition to existing old to new)
author Artem Tikhomirov <tikhomirov.artem@gmail.com>
date Thu, 13 Dec 2012 15:46:40 +0100 (2012-12-13)
parents 909306e412e2
children 0be5be8d57e9
rev   line source
320
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
1 /*
412
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
2 * Copyright (c) 2011-2012 TMate Software Ltd
320
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
3 *
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
4 * This program is free software; you can redistribute it and/or modify
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
5 * it under the terms of the GNU General Public License as published by
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
6 * the Free Software Foundation; version 2 of the License.
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
7 *
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
8 * This program is distributed in the hope that it will be useful,
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
9 * but WITHOUT ANY WARRANTY; without even the implied warranty of
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
10 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
11 * GNU General Public License for more details.
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
12 *
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
13 * For information on how to redistribute this software under
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
14 * the terms of a license other than GNU General Public License
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
15 * contact TMate Software at support@hg4j.com
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
16 */
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
17 package org.tmatesoft.hg.internal;
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
18
456
909306e412e2 Refactor LogFacility and SessionContext, better API for both
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 418
diff changeset
19 import static org.tmatesoft.hg.util.LogFacility.Severity.Error;
909306e412e2 Refactor LogFacility and SessionContext, better API for both
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 418
diff changeset
20
412
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
21 import java.nio.ByteBuffer;
415
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
22 import java.nio.CharBuffer;
412
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
23 import java.nio.charset.CharacterCodingException;
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
24 import java.nio.charset.Charset;
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
25 import java.nio.charset.CharsetDecoder;
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
26 import java.nio.charset.CharsetEncoder;
320
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
27
415
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
28 import org.tmatesoft.hg.core.SessionContext;
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
29
320
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
30 /**
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
31 * Keep all encoding-related issues in the single place
415
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
32 * NOT thread-safe (encoder and decoder requires synchronized access)
320
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
33 * @author Artem Tikhomirov
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
34 * @author TMate Software Ltd.
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
35 */
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
36 public class EncodingHelper {
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
37 // XXX perhaps, shall not be full of statics, but rather an instance coming from e.g. HgRepository?
412
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
38 /*
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
39 * To understand what Mercurial thinks of UTF-8 and Unix byte approach to names, see
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
40 * http://mercurial.808500.n3.nabble.com/Unicode-support-request-td3430704.html
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
41 */
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
42
415
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
43 private final SessionContext sessionContext;
412
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
44 private final CharsetEncoder encoder;
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
45 private final CharsetDecoder decoder;
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
46
415
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
47 EncodingHelper(Charset fsEncoding, SessionContext ctx) {
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
48 sessionContext = ctx;
412
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
49 decoder = fsEncoding.newDecoder();
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
50 encoder = fsEncoding.newEncoder();
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
51 }
320
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
52
418
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
53 /**
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
54 * Translate file names from manifest to amazing Unicode string
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
55 */
412
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
56 public String fromManifest(byte[] data, int start, int length) {
418
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
57 return decodeWithSystemDefaultFallback(data, start, length);
320
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
58 }
418
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
59
415
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
60 /**
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
61 * @return byte representation of the string directly comparable to bytes in manifest
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
62 */
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
63 public byte[] toManifest(String s) {
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
64 if (s == null) {
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
65 // perhaps, can return byte[0] in this case?
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
66 throw new IllegalArgumentException();
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
67 }
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
68 try {
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
69 // synchonized(encoder) {
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
70 ByteBuffer bb = encoder.encode(CharBuffer.wrap(s));
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
71 // }
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
72 byte[] rv = new byte[bb.remaining()];
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
73 bb.get(rv, 0, rv.length);
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
74 return rv;
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
75 } catch (CharacterCodingException ex) {
456
909306e412e2 Refactor LogFacility and SessionContext, better API for both
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 418
diff changeset
76 sessionContext.getLog().dump(getClass(), Error, ex, String.format("Use of charset %s failed, resort to system default", charset().name()));
415
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
77 // resort to system-default
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
78 return s.getBytes();
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
79 }
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
80 }
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
81
418
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
82 /**
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
83 * Translate file names from dirstate to amazing Unicode string
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
84 */
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
85 public String fromDirstate(byte[] data, int start, int length) {
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
86 return decodeWithSystemDefaultFallback(data, start, length);
412
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
87 }
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
88
418
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
89 private String decodeWithSystemDefaultFallback(byte[] data, int start, int length) {
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
90 try {
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
91 return decoder.decode(ByteBuffer.wrap(data, start, length)).toString();
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
92 } catch (CharacterCodingException ex) {
456
909306e412e2 Refactor LogFacility and SessionContext, better API for both
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 418
diff changeset
93 sessionContext.getLog().dump(getClass(), Error, ex, String.format("Use of charset %s failed, resort to system default", charset().name()));
418
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
94 // resort to system-default
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
95 return new String(data, start, length);
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
96 }
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
97 }
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
98
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
99 private Charset charset() {
412
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
100 return encoder.charset();
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
101 }
415
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
102
320
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
103 }