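String formatting

This chapter describes the string formatting facilities available in Zen. They fall into two topics:

1. Formatting strings for output
2. Converting from strings

Topic 1 is for building strings inside Zen source code; std.debug.warn belongs here. Topic 2 is for converting string data into another form, for example converting the user-supplied string "1" into the integer 1.

The code that processes these strings is implemented in the std.fmt module. The std.debug and std.io modules use std.fmt to format a string and then write it to the target file (including standard output and standard error).

Formatting strings for output

Many code samples in this book use the std.debug.warn function to write text to standard error. warn takes a template string and the arguments that fill its placeholders, and writes the formatted string to standard error.

For example, the following code writes Hello World to standard error.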

    std.debug.warn("Hello {}\n", .{"World"});

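Here, the first string, Hello {}\n, is called the template string. The {} inside it is a placeholder: the argument that follows the template string is rendered at that position. Arguments are passed as an anonymous struct literal.

Constraints on string formatting

When there are no placeholders

A format call always takes a template string paired with an anonymous struct literal. Even when the template has no placeholders, as in the following code, an empty anonymous struct literal must be passed.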

    std.debug.warn("Hello World\n", .{});

Maximum number of arguments

A format call accepts at most 32 arguments, so the following code, which passes 33, fails to compile.

    std.debug.warn("{}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}\n",
            .{ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33 });

Compiling this code reports the following error.

エラー[E02040]: 32 arguments max are supported per format call
        @compileError("32 arguments max are supported per format call");
        ~
...

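The work of building a string from the template string and its arguments is done by the std.fmt.format function.

Formatting API overview

The following APIs use std.fmt.format to format a string and write it out.

std.fmt.bufPrint: writes the formatted string into a caller-supplied buffer.
std.fmt.allocPrint: allocates a buffer from a caller-supplied allocator and writes the formatted string into it.
std.debug.warn: writes the formatted string to standard error.
std.debug.printError: like warn, but prints error: in red before the formatted string.
std.debug.printHint: like warn, but prints hint: in cyan before the formatted string.
std.debug.printInfo: like warn, but prints info: in green before the formatted string.
std.fs.write.print: writes the formatted string to a specified output stream.

These APIs are described below.

std.fmt.bufPrint / std.fmt.allocPrint

Use these when the program needs the formatted string as a value.

bufPrint takes a []mut u8 buffer and writes the string into it. If the buffer is too short, it returns an EndOfStream error.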

examples/ch13-io/format/src/format_api.zen:6:13

test "bufPrint" {
    var buf: [16]u8 = undefined;
    const formatted = try fmt.bufPrint(&mut buf, "Hello {}", .{"World"});
    equalSlices(u8, "Hello World", formatted);

    const e = fmt.bufPrint(&mut buf, "too long string for the buffer", .{});
    err(error.EndOfStream, e);
}

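allocPrint takes an allocator as its first argument instead of a buffer. If the allocation fails, it returns an OutOfMemory error.

std.debug.warn / std.debug.printError / std.debug.printHint / std.debug.printInfo

These functions write the formatted string to standard error. All except warn print a colored error:, hint:, or info: prefix before the formatted string, and all except warn append a newline automatically.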

examples/ch13-io/format/src/format_api.zen:20:25

test "std.debug" {
    std.debug.warn("\n{}\n", .{"warn"});
    std.debug.printError("print{}", .{"Error"});
    std.debug.printHint("print{}", .{"Hint"});
    std.debug.printInfo("print{}", .{"Info"});
}

Running this test produces the output below. It does not show in this document, but the error:, hint:, and info: prefixes are actually printed in red, cyan, and green respectively.

warn
error: printError
hint: printHint
info: printInfo

std.fs.write.print

This function writes the formatted string to an instance of std.fs.File. For example, to write a formatted string to standard output:

examples/ch13-io/format/src/format_api.zen:25:28

pub fn main() !void {
    const stdout = try std.fs.getStdOut();
    try std.fs.write.print(stdout, "Hello {}\n", .{"World"});
}

$ zen run format_api.zen
Hello World

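Placeholders

A placeholder marks where an argument's value is inserted and is written {}.

A placeholder can carry a format. The full syntax is shown below; each part is explained in the sections that follow.

{[position][specifier]:[fill][alignment][width].[precision]}

A template string may contain multiple placeholders. The field values of the anonymous struct literal passed as the argument are inserted into them in order.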

examples/ch13-io/format/src/placeholder.zen:3:6

test "placeholder" {
    // output: 0, one, 2.0e+00
    std.debug.warn("{}, {}, {}\n", .{ @to(u32, 0), "one", @to(f64, 2.0) });
}

Escaping {}

To print the { and } used by placeholders as literal characters, write {{ and }}.

examples/ch13-io/format/src/placeholder.zen:159:164

test "escape placeholder" {
    // output: {
    std.debug.warn("{{\n", .{});
    // output: }
    std.debug.warn("}}\n", .{});
}
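
Escaped braces and real placeholders can share one template string. The following is a minimal sketch that recombines only the calls shown earlier in this chapter (fmt.bufPrint and equalSlices); it assumes that equalSlices is provided by std.testing.

```zen
const std = @import("std");
const fmt = std.fmt;
// Assumption: the test helpers used in this chapter come from std.testing.
const equalSlices = std.testing.equalSlices;

test "escaped braces around a placeholder" {
    var buf: [16]u8 = undefined;
    // "{{" and "}}" produce literal braces; the "{}" between them is a real placeholder.
    const formatted = try fmt.bufPrint(&mut buf, "{{ {} }}", .{"key"});
    equalSlices(u8, "{ key }", formatted);
}
```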

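Default formats

Zen's primitive types are printed in a default format defined by the standard library.

A struct is printed field by field, as shown below. You can also define the format yourself by implementing the format method described later.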

examples/ch13-io/format/src/placeholder.zen:68:77

test "format struct print" {
    const Struct = struct {
        x: f64,
        y: f64,
    };

    const s = Struct{ .x = 1.0, .y = 2.0 };
    // output: Struct{ .x = 1.0e+00, .y = 2.0e+00 }
    std.debug.warn("{}\n", .{s});
}

A tagged union is printed in the same way, showing the active variant and its value.

examples/ch13-io/format/src/placeholder.zen:79:93

test "format tagged union" {
    const Value = union(enum) {
        String: []u8,
        F64: f64,
        U32: u32,
    };

    const string = Value{ .String = "hello" };
    // output: Value{ .String = hello }
    std.debug.warn("{}\n", .{string});

    const float = Value{ .F64 = 1.2345 };
    // output: Value{ .F64 = 1.2345e+00 }
    std.debug.warn("{}\n", .{float});
}

An error union prints the error name when it holds an error, and the stored value otherwise.

examples/ch13-io/format/src/placeholder.zen:95:107

test "format error union print" {
    const Error = error{
        AnError,
    };

    const err: Error!u32 = Error.AnError;
    // output: error.AnError
    std.debug.warn("{}\n", .{err});

    const value: Error!u32 = @to(u32, 1);
    // output: 1
    std.debug.warn("{}\n", .{value});
}

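Pointers behave differently for primitive types and for structs. With {}, a pointer to a primitive type is printed as TypeName@address, with the address in hexadecimal.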

examples/ch13-io/format/src/placeholder.zen:109:114

test "format pointer print" {
    const x: u32 = 0;
    // the address after `@` differs on every run
    // output: u32@20c65c
    std.debug.warn("{}\n", .{&x});
}

For a pointer to a struct, {} prints the same output as the struct value itself, while {*} prints TypeName@address, with the address in hexadecimal.

examples/ch13-io/format/src/placeholder.zen:116:127

test "format struct pointer print" {
    const Struct = struct{
        x: f64,
        y: f64,
    };

    const s = Struct{ .x = 1.0, .y = 2.0 };
    // output: Struct{ .x = 1.0e+00, .y = 2.0e+00 }
    std.debug.warn("{}\n", .{&s});
    // output: Struct@20c230
    std.debug.warn("{*}\n", .{&s});
}

Argument position

{n} selects the argument at position n. Positions are counted from 0.

examples/ch13-io/format/src/placeholder.zen:8:14

test "position" {
    // output: zero, one, zero, one
    std.debug.warn("{}, {}, {0}, {1}\n", .{ "zero", "one" });

    // output: zero, one, one, zero
    std.debug.warn("{}, {}, {1}, {0}\n", .{ "zero", "one" });
}
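
Explicit positions also make it possible to use one argument more than once. The sketch below is illustrative only: it captures the result with fmt.bufPrint so that it can be checked, and it assumes that equalSlices is provided by std.testing.

```zen
const std = @import("std");
const fmt = std.fmt;
// Assumption: the test helpers used in this chapter come from std.testing.
const equalSlices = std.testing.equalSlices;

test "reuse one argument by position" {
    var buf: [64]u8 = undefined;
    // {0} appears twice, so "left" is printed twice.
    const formatted = try fmt.bufPrint(
        &mut buf,
        "{0} and {1}, then {0} again",
        .{ "left", "right" },
    );
    equalSlices(u8, "left and right, then left again", formatted);
}
```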

Width

{:n} sets the minimum width n of the output. If no alignment is given, the output is left-aligned.

examples/ch13-io/format/src/placeholder.zen:16:19

test "width" {
    // output: |ABC  |1    |ABC|
    std.debug.warn("|{:5}|{:5}|{:1}|\n", .{ "ABC", @to(u32, 1), "ABC" });
}
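
A minimum width is a simple way to line up columns. The following sketch builds one table row; the column widths are arbitrary choices for illustration, and equalSlices is assumed to come from std.testing.

```zen
const std = @import("std");
const fmt = std.fmt;
// Assumption: the test helpers used in this chapter come from std.testing.
const equalSlices = std.testing.equalSlices;

test "width pads short values on the right" {
    var buf: [32]u8 = undefined;
    // "id" is padded to 6 characters and 42 to 4 characters (left-aligned by default).
    const row = try fmt.bufPrint(&mut buf, "|{:6}|{:4}|", .{ "id", @to(u32, 42) });
    equalSlices(u8, "|id    |42  |", row);
}
```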

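Alignment

Alignment controls where the output is placed (left, center, or right) when it is shorter than the specified width. Write <, ^, or > after the : to select it. The alignment option takes effect only when a width is specified.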

examples/ch13-io/format/src/placeholder.zen:139:147

test "format placeholder alignment" {
    // output:
    // |123  |1    |
    // | 123 |  1  |
    // |  123|    1|
    std.debug.warn("|{:<5}|{:<5}|\n", .{ @to(u32, 123), @to(u32, 1) });
    std.debug.warn("|{:^5}|{:^5}|\n", .{ @to(u32, 123), @to(u32, 1) });
    std.debug.warn("|{:>5}|{:>5}|\n", .{ @to(u32, 123), @to(u32, 1) });
}

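Fill character

The fill character pads the unused space when an alignment is specified. Write a single character between the : and the alignment option <, ^, or >. The fill option takes effect only when both a width and an alignment are specified.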

examples/ch13-io/format/src/placeholder.zen:149:157

test "format placeholder fill" {
    // output:
    // |123--|1----|
    // |-123-|--1--|
    // |--123|----1|
    std.debug.warn("|{:-<5}|{:-<5}|\n", .{ @to(u32, 123), @to(u32, 1) });
    std.debug.warn("|{:-^5}|{:-^5}|\n", .{ @to(u32, 123), @to(u32, 1) });
    std.debug.warn("|{:->5}|{:->5}|\n", .{ @to(u32, 123), @to(u32, 1) });
}
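
Each placeholder carries its own fill character and alignment, so they can differ within one template string. This is a small sketch under the rules described above; the fill characters * and _ are arbitrary choices, and equalSlices is assumed to come from std.testing.

```zen
const std = @import("std");
const fmt = std.fmt;
// Assumption: the test helpers used in this chapter come from std.testing.
const equalSlices = std.testing.equalSlices;

test "different fill and alignment per placeholder" {
    var buf: [32]u8 = undefined;
    const n = @to(u32, 42);
    // right-aligned with '*', left-aligned with '_', centered with '-'
    const line = try fmt.bufPrint(&mut buf, "{:*>6}|{:_<6}|{:-^6}", .{ n, n, n });
    equalSlices(u8, "****42|42____|--42--", line);
}
```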

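Numeric precision

{:.n} sets the number of digits n used to print a number.

For floating-point numbers, n is the number of digits printed after the decimal point.

For integers, missing digits are padded with zeros. If the value has more than n digits, all of them are printed.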

examples/ch13-io/format/src/placeholder.zen:21:36

test "precision" {
    // output: 3.14159e+00
    std.debug.warn("{:.5}\n", .{@to(f64, 3.14159265359)});

    // output: 1.0013e+02
    std.debug.warn("{:.4}\n", .{@to(f64, 100.125)});

    // output: 00123
    std.debug.warn("{:.5}\n", .{@to(u32, 123)});

    // output: 123
    std.debug.warn("{:.1}\n", .{@to(u32, 123)});

    // output: 00000abc
    std.debug.warn("{x:.8}\n", .{@to(u32, 0xabc)});
}

Format specifiers

A placeholder can also contain a format specifier that controls how the value is rendered.

examples/ch13-io/format/src/placeholder.zen:38:66

test "format specifiers" {
    // {c}: print as a single ASCII character
    // output: A
    std.debug.warn("{c}\n", .{@to(u8, 65)});

    // {b}: print as binary
    // output: 11110001001000000
    std.debug.warn("{b}\n", .{@to(u32, 123456)});

    // {x}: print as lowercase hexadecimal
    // output: 1e240
    std.debug.warn("{x}\n", .{@to(u32, 123456)});

    // {X}: print as uppercase hexadecimal
    // output: 1E240
    std.debug.warn("{X}\n", .{@to(u32, 123456)});

    // {e}: print a floating-point number in exponent notation
    // output: 1.23456e+02
    std.debug.warn("{e}\n", .{@to(f64, 123.456)});

    // {d}: print an integer or floating-point number in decimal
    // output: 123.456
    std.debug.warn("{d}\n", .{@to(f64, 123.456)});

    // {s}: print a NUL-terminated string
    // output: hello
    std.debug.warn("{s}\n", .{"hello"});
}

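User-defined struct formats

You can define how a struct is formatted yourself. To do so, give the struct a public (pub) format method with the prescribed signature.

For example, suppose we want to print the following Point struct, which has coordinate fields x and y, in the form (x, y) = (value of x, value of y).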

const Point = struct {
    x: f64, y: f64,
};

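The format method is implemented as follows. The number of parameters makes it hard to read at first, but the key point is that the method receives the placeholder information and calls std.fmt.format based on it.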

examples/ch13-io/format/src/user_define.zen:4:22

const Point = struct {
    x: f64,
    y: f64,

    pub fn format(
        self: Point,
        comptime fmt_spec: []u8,
        options: fmt.FormatOptions,
        comptime Errors: type,
        out_stream: std.io.OutStream(Errors),
    ) Errors!void {
        return fmt.format(
            Errors,
            out_stream,
            "(x, y) = ({d:.3}, {d:.3})", // template string
            .{ self.x, self.y },
        );
    }
};

The third argument of std.fmt.format is the template string, and the fourth supplies the arguments that fill its placeholders.

Formatting a Point value now prints it in the format we defined.

examples/ch13-io/format/src/user_define.zen:21:25

test "user defined format" {
    const point = Point{ .x = 1.0, .y = 2.0 };
    // output: (x, y) = (1.000, 2.000)
    std.debug.warn("{}\n", .{point});
}
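
The same pattern applies to other structs. As a further sketch, the hypothetical Rgb struct below (not part of the original examples) prints itself as a hexadecimal color code. The format signature is copied from the Point example, and the two-digit output relies on the zero-padding rule for integer precision described under "Numeric precision".

```zen
const std = @import("std");
const fmt = std.fmt;

// Hypothetical example type, introduced only for illustration.
const Rgb = struct {
    r: u8,
    g: u8,
    b: u8,

    // Same signature as Point.format above.
    pub fn format(
        self: Rgb,
        comptime fmt_spec: []u8,
        options: fmt.FormatOptions,
        comptime Errors: type,
        out_stream: std.io.OutStream(Errors),
    ) Errors!void {
        return fmt.format(
            Errors,
            out_stream,
            "#{x:.2}{x:.2}{x:.2}", // each channel as two lowercase hex digits
            .{ self.r, self.g, self.b },
        );
    }
};

test "user defined format for Rgb" {
    const color = Rgb{ .r = 255, .g = 128, .b = 10 };
    // expected by the rules above: #ff800a
    std.debug.warn("{}\n", .{color});
}
```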

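Converting from strings

These functions parse a string and convert it to an integer or floating-point type. They are useful when accepting input from the command line or reading data from a file. The standard library functions for this live in the std.fmt module:

parseInt: converts a string in signed-integer notation to any integer type.
parseUnsigned: converts a string in unsigned-integer notation to any integer type.
parseFloat: converts a string to any floating-point type.

parseInt / parseUnsigned

The signature of parseInt is fn parseInt(comptime T: type, buf: []u8, radix: u8) !T. The first argument is the target integer type, the second is the string slice, and the third is the radix, which can be any base from 2 to 36.

The following examples show parseInt returning a value successfully. The target type may be signed or unsigned and may have any bit width. A leading + or - before the digits is also accepted.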

examples/ch13-io/format/src/parse_int.zen:6:27

test "parseInt" {
    const a = try fmt.parseInt(i32, "10", 10);
    ok(a == 10);

    const b = try fmt.parseInt(i32, "+10", 10);
    ok(b == 10);

    const c = try fmt.parseInt(i32, "-10", 10);
    ok(c == -10);

    const d = try fmt.parseInt(u32, "10", 10);
    ok(d == 10);

    const e = try fmt.parseInt(u32, "1001", 2);
    ok(e == 9);

    const f = try fmt.parseInt(u32, "FF", 16);
    ok(f == 255);

    const g = try fmt.parseInt(u5, "31", 10);
    ok(g == 31);
}
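
The radix argument is not limited to 2, 10, and 16. The sketch below tries a few other bases within the 2 to 36 range stated above; it assumes that ok is provided by std.testing.

```zen
const std = @import("std");
const fmt = std.fmt;
// Assumption: the test helpers used in this chapter come from std.testing.
const ok = std.testing.ok;

test "parseInt with other radixes" {
    // octal: 7*64 + 7*8 + 7 = 511
    const octal = try fmt.parseInt(u32, "777", 8);
    ok(octal == 511);

    // base 36 uses the digits 0-9 and the letters A-Z: 35*36 + 35 = 1295
    const base36 = try fmt.parseInt(u32, "ZZ", 36);
    ok(base36 == 1295);

    // a sign can be combined with a non-decimal radix
    const negative_hex = try fmt.parseInt(i32, "-7F", 16);
    ok(negative_hex == -127);
}
```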

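When the target integer type is unsigned, a string containing - fails to convert and error.InvalidCharacter is returned. A string containing any character that cannot be read as part of the number, including a space, also yields error.InvalidCharacter. If the converted value does not fit in the target integer type, the conversion fails with error.Overflow.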

examples/ch13-io/format/src/parse_int.zen:29:38

test "fail to parseInt" {
    const invalid_minus_sign = fmt.parseInt(u32, "-10", 10);
    err(error.InvalidCharacter, invalid_minus_sign);

    const invalid_char = fmt.parseInt(u8, " 10", 10);
    err(error.InvalidCharacter, invalid_char);

    const overflow = fmt.parseInt(u8, "256", 10);
    err(error.Overflow, overflow);
}
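
When input comes from a user, these errors usually need to be handled rather than propagated. One possible approach, sketched below, is to fall back to a default value with catch. The helper parsePortOr is hypothetical and exists only for this example; ok is assumed to come from std.testing.

```zen
const std = @import("std");
const fmt = std.fmt;
// Assumption: the test helpers used in this chapter come from std.testing.
const ok = std.testing.ok;

// Hypothetical helper: returns `fallback` when `text` is not a valid u16.
fn parsePortOr(text: []u8, fallback: u16) u16 {
    return fmt.parseInt(u16, text, 10) catch fallback;
}

test "fall back on parse errors" {
    ok(parsePortOr("8080", 0) == 8080);
    // does not fit in u16 (error.Overflow), so the fallback is used
    ok(parsePortOr("65536", 0) == 0);
    // not a number (error.InvalidCharacter), so the fallback is used
    ok(parsePortOr("http", 0) == 0);
}
```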

parseUnsigned is equivalent to parseInt except that it does not accept strings that start with -.
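
A short sketch of parseUnsigned follows. It assumes that parseUnsigned takes the same three arguments as parseInt and that a leading - is reported as error.InvalidCharacter, as it is for parseInt with an unsigned target type; neither detail is stated explicitly above. ok and err are assumed to come from std.testing.

```zen
const std = @import("std");
const fmt = std.fmt;
// Assumption: the test helpers used in this chapter come from std.testing.
const ok = std.testing.ok;
const err = std.testing.err;

test "parseUnsigned" {
    const value = try fmt.parseUnsigned(u8, "255", 10);
    ok(value == 255);

    // Assumption: a leading '-' is rejected with error.InvalidCharacter.
    const rejected = fmt.parseUnsigned(u32, "-10", 10);
    err(error.InvalidCharacter, rejected);
}
```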

parseFloat

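The signature of parseFloat is fn parseFloat(comptime T: type, s: []u8) !T. The first argument is the target floating-point type, one of f16 / f32 / f64 / f128. The second argument is the string slice.

The following examples use parseFloat. Strings in exponent notation can be converted, as can nan and +inf / -inf.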

examples/ch13-io/format/src/parse_int.zen:40:74

test "parseFloat" {
    const approxEq = std.math.approxEq;
    const epsilon = 1e-7;

    // floating-point types that can be used as the target
    inline for ([_]type{ f16, f32, f64, f128 }) |floatType| {
        const zero = try fmt.parseFloat(floatType, "0");
        ok(zero == 0.0);

        const plus_zero = try fmt.parseFloat(floatType, "+0");
        const minus_zero = try fmt.parseFloat(floatType, "-0");
        ok(plus_zero == 0.0);
        ok(minus_zero == 0.0);

        const pi = try fmt.parseFloat(floatType, "3.141");
        const minus_pi = try fmt.parseFloat(floatType, "-3.141");
        ok(approxEq(floatType, pi, 3.141, epsilon));
        ok(approxEq(floatType, minus_pi, -3.141, epsilon));

        // exponent notation
        const exp = try fmt.parseFloat(floatType, "1.23456e+2");
        ok(approxEq(floatType, exp, 123.456, epsilon));
    }

    // NaN can be converted too
    // NaN never compares equal to NaN, so compare the bit patterns instead
    const nan = try fmt.parseFloat(f64, "NaN");
    ok(@bitCast(u64, nan) == @bitCast(u64, std.math.nan(f64)));

    // `inf` / `-inf` can be converted too
    const inf = try fmt.parseFloat(f64, "inf");
    const minus_inf = try fmt.parseFloat(f64, "-inf");
    ok(inf == std.math.inf(f64));
    ok(minus_inf == -std.math.inf(f64));
}
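
The two halves of this chapter combine naturally: parse a value from text, then format it again in a fixed layout. The following sketch uses only parseFloat and bufPrint as described above and assumes that equalSlices is provided by std.testing.

```zen
const std = @import("std");
const fmt = std.fmt;
// Assumption: the test helpers used in this chapter come from std.testing.
const equalSlices = std.testing.equalSlices;

test "parse a float, then format it" {
    // 1.5 is exactly representable, so no rounding is involved.
    const value = try fmt.parseFloat(f64, "1.5");

    var buf: [16]u8 = undefined;
    // {d:.3}: decimal notation with three digits after the decimal point
    const formatted = try fmt.bufPrint(&mut buf, "{d:.3}", .{value});
    equalSlices(u8, "1.500", formatted);
}
```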
