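A Hidemaru Macro for Cleaning Up English Captions from Overseas Talks
When you turn on English captions for, say, a GDC talk on YouTube, the text has no line breaks at natural positions and no periods or commas, and almost all of it is lowercase, which makes it very hard to read.
So I put together a Hidemaru macro to clean up the text.
(I already use Hidemaru a lot for tidying up prose and lists.)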
Incidentally, YouTube captions can also be downloaded with Language Reactor.
Language Reactor - Chrome Web Store
This is the talk I used when testing the macro.
Visual Effects Bootcamp: The Rise of Realtime - YouTube
▼ Before running the macro (YouTube captions as-is)
▼ After running the macro
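The macro still introduces some unwanted replacements and line breaks, but I think the result is much easier to read than the original text.
After that, a few corrections made while listening to the talk are enough to finish it off, and the cleaned-up text also gives more natural results when run through DeepL.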
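I referred to the following pages when writing the macro.
Hidemaru Editor Macro Language (Introductory) Help: Table of Contents (for Ver. 9.15)
Studying Hidemaru Macros (Part 13) | A Translator's Daily Life
Regular Expressions (for Ver. 9.15)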
The macro is shown below.
It probably still has problems and I plan to keep improving it, but I am posting it here for now.
setcompatiblemode 0x0F;
begingroupundo;

// ‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥
// Even when a replacement does not use regular expressions,
// it is apparently better to specify noregular explicitly
// https://ameblo.jp/kuma-2011/entry-10987753201.html
//
// Add casesense to distinguish lowercase from uppercase
// nohighlight (no highlighting after a replacement) is omitted
// ‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥

// Replace every line break with a single space to join the text into one line
replaceallfast "\\n" , " " , regular;

// Remove the bracketed scene descriptions found in YouTube captions
replaceallfast "[Music]" , "" , noregular, casesense;
replaceallfast "[Applause]" , "" , noregular, casesense;

// Abbreviations
replaceallfast " freakin " , " freaking " , noregular, casesense;
replaceallfast " aka " , " a.k.a. " , noregular, casesense;

// Capitalize words that should always be capitalized (mostly technical terms and proper nouns)

// I
replaceallfast " i " , " I " , noregular, casesense;
replaceallfast " i'm " , " I'm " , noregular, casesense;
replaceallfast " i've " , " I've " , noregular, casesense;
replaceallfast " i'll " , " I'll " , noregular, casesense;
replaceallfast " i'd " , " I'd " , noregular, casesense;

// Countries, Languages
replaceallfast " america " , " America " , noregular, casesense;
replaceallfast " english " , " English " , noregular, casesense;
replaceallfast " france " , " France " , noregular, casesense;
replaceallfast " french " , " French " , noregular, casesense;
replaceallfast " japan " , " Japan " , noregular, casesense;
replaceallfast " japanese " , " Japanese " , noregular, casesense;

// Services
replaceallfast " youtube " , " YouTube " , noregular, casesense;
replaceallfast " twitter " , " Twitter " , noregular, casesense;
replaceallfast " facebook " , " Facebook " , noregular, casesense;
replaceallfast " instagram " , " Instagram " , noregular, casesense;
replaceallfast " linkedin " , " LinkedIn " , noregular, casesense;
replaceallfast " artstation " , " ArtStation " , noregular, casesense;
replaceallfast " art station " , " ArtStation " , noregular, casesense;
replaceallfast " cg hub " , " CGHub " , noregular, casesense;
replaceallfast " kickstarter " , " Kickstarter " , noregular, casesense;
replaceallfast " github " , " GitHub " , noregular, casesense;

// Software Companies, Tools
replaceallfast " autodesk " , " Autodesk " , noregular, casesense;
replaceallfast " auto desk " , " Autodesk " , noregular, casesense;
replaceallfast " maya " , " Maya " , noregular, casesense;
replaceallfast " side effects " , " SideFX " , noregular, casesense;
replaceallfast " houdini " , " Houdini " , noregular, casesense;
replaceallfast " adobe " , " Adobe " , noregular, casesense;
replaceallfast " photoshop " , " Photoshop " , noregular, casesense;
replaceallfast " substance designer " , " Substance Designer " , noregular, casesense;
replaceallfast " substance painter " , " Substance Painter " , noregular, casesense;
replaceallfast " substance " , " Substance " , noregular, casesense;
replaceallfast " blender " , " Blender " , noregular, casesense;
replaceallfast " krita " , " Krita " , noregular, casesense;
replaceallfast " embergen " , " EmberGen " , noregular, casesense;
replaceallfast " popcorn effects " , " PopcornFX " , noregular, casesense;
replaceallfast " popcorn effect " , " PopcornFX " , noregular, casesense;

// Developers
replaceallfast " naughty dog " , " Naughty Dog " , noregular, casesense;

// Game Engine
replaceallfast " unity " , " Unity " , noregular, casesense;
replaceallfast " epic games " , " Epic Games " , noregular, casesense;
replaceallfast " unreal engine " , " Unreal Engine " , noregular, casesense;
replaceallfast " unreal " , " Unreal " , noregular, casesense;

// General Terms
replaceallfast " pcs " , " PCs " , noregular, casesense;
replaceallfast " pc " , " PC " , noregular, casesense;
replaceallfast " bootcamp " , " Boot Camp " , noregular, casesense;
// Uppercase group 1 only; group 0 is the whole match including the surrounding spaces, which would double them
replaceallfast " (tv|url|faq|qr|ceo|cto|cfo|cc0) " , " \\(1,ToUpper) " , regular, casesense;
replaceallfast " (real time|real-time) " , " Real-Time " , regular, casesense;

// CG Terms
replaceallfast " (cg|2d|3d|2k|4k|vr|ar|mr|gdc|vfx|ui|qa|rgb|fbx|gif|bmp|tga|png|jpeg|jpg|uv|dlc|ip) " , " \\(1,ToUpper) " , regular, casesense;
// "b effects" is a common mis-transcription of "VFX"
replaceallfast " (visual effects|visual effect|[bB] effects) " , " VFX " , regular, casesense;
replaceallfast " (effects|effect) " , " FX " , regular, casesense;
replaceallfast " triple a " , " AAA " , noregular, casesense;
replaceallfast " normals " , " Normals " , noregular, casesense;
replaceallfast " normal " , " Normal " , noregular, casesense;
replaceallfast " shaders " , " Shaders " , noregular, casesense;
replaceallfast " shader " , " Shader " , noregular, casesense;
replaceallfast " (flip books|flipbooks|full ebooks) " , " Flipbooks " , regular, casesense;
replaceallfast " (flip book|flipbook) " , " Flipbook " , regular, casesense;

// Remove interjections so the text is easier to translate
replaceallfast " um " , " " , noregular, casesense;
replaceallfast " uh " , " " , noregular, casesense;
replaceallfast " yup " , " " , noregular, casesense;
// "you know" is sometimes used in its literal sense, so mark the deletion with _
replaceallfast " you know " , "_ " , noregular, casesense;

// Capitalize (and add a comma) so spoken asides are easier to spot
replaceallfast " oh " , " Oh, " , noregular, casesense;
replaceallfast " hey " , " Hey, " , noregular, casesense;
replaceallfast " yeah " , " yeah, " , noregular, casesense;

// ‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥
// From here on: replacements that insert line breaks to make the text easier to read and translate
// Some cases should not get a line break, but where a break is usually right, the break takes priority
// Processing order matters, so the rules with the least impact run first
// Targeting so / and / but / because / then produces a flood of line breaks, so those need follow-up handling
// ‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥

// Start with the relatively safe ones
// Break the line and capitalize the first letter of the next sentence
replaceallfast " (something like that) ([a-zA-Z])" , " \\1\\.\\n\\(2,ToUpper)" , regular, casesense;

// So, And, Then and the like may come right before these
replaceallfast " okay " , "\\.\\nOkay, " , regular, casesense;
replaceallfast " alright " , "\\.\\nAlright, " , regular, casesense;

// In talks these usually start a sentence, so break the line
// They also usually want a trailing comma
replaceallfast " and on top of that " , "\\.\\nAnd, on top of that, " , regular, casesense;
replaceallfast " but on top of that " , "\\.\\nBut, on top of that, " , regular, casesense;
replaceallfast "(?<!And|But|,) on top of that " , "\\.\\nOn top of that, " , regular, casesense;
replaceallfast " in addition " , "\\.\\nIn addition, " , regular, casesense;
replaceallfast " so now " , "\\.\\nSo now, " , regular, casesense;
replaceallfast "(?<!right|from|for|So|,) now " , "\\.\\nNow, " , regular, casesense;
replaceallfast " (so) (first and foremost|first) " , "\\.\\nSo \\2, " , regular, casesense;
replaceallfast " and first " , "\\.\\nAnd first, " , regular, casesense;
replaceallfast "(?<!the|at|too|very|So|And|my|,) first " , "\\.\\nFirst, " , regular, casesense;
replaceallfast " so next " , "\\.\\nSo next, " , regular, casesense;
replaceallfast " and next " , "\\.\\nAnd next, " , regular, casesense;
replaceallfast "(?<!the|So|And|,) next " , "\\.\\nNext, " , regular, casesense;
replaceallfast " (but) (then again|then) " , "\\.\\nBut \\2, " , regular, casesense;
replaceallfast " and then again " , "\\.\\nAnd then again, " , regular, casesense;
replaceallfast "(?<!But|And|,) then again " , "\\.\\nThen again, " , regular, casesense;
replaceallfast " and for example " , "\\.\\nAnd for example, " , regular, casesense;
replaceallfast "(?<!And) for example " , "\\.\\nFor example, " , regular, casesense;
replaceallfast " (as) (I|we) (said before|said) " , "\\.\\nAs \\2 \\3, " , regular, casesense;
replaceallfast " (as) (I|we) (said) " , "\\.\\nAs \\2 \\3, " , regular, casesense;
replaceallfast " (like) (I|we) (said) " , "\\.\\nLike \\2 \\3, " , regular, casesense;
replaceallfast " so let's " , "\\.\\nSo let's " , regular, casesense;
replaceallfast "(?<!So|and|then|,) let's " , "\\.\\nLet's " , regular, casesense;

// To insert a comma
replaceallfast " and at the same time " , "\\.\\nAnd at the same time, " , regular, casesense;
replaceallfast " at the same time " , "\\.\\nAt the same time, " , regular, casesense;
replaceallfast " at that time " , "\\.\\nAt that time, " , regular, casesense;

// In talks, connectives like these usually start a sentence, so break the line
// "So" sometimes precedes them, so register both forms
replaceallfast " so finally " , "\\.\\nSo finally, " , regular, casesense;
replaceallfast "(?<!So|I|we|,) finally " , "\\.\\nFinally, " , regular, casesense;
replaceallfast " so eventually " , "\\.\\nSo eventually, " , regular, casesense;
replaceallfast "(?<!So|I|we|,) eventually " , "\\.\\nEventually, " , regular, casesense;

// These usually start a sentence, so break the line
replaceallfast "(?<!and|but|because|think|if|,) there is " , "\\.\\nThere is " , regular, casesense;
replaceallfast "(?<!and|but|because|think|if|,) there's " , "\\.\\nThere's " , regular, casesense;
replaceallfast "(?<!and|but|because|think|if|,) there was " , "\\.\\nThere was " , regular, casesense;
replaceallfast "(?<!and|but|because|think|if|,) there are " , "\\.\\nThere are " , regular, casesense;
replaceallfast "(?<!and|but|because|think|if|,) there're " , "\\.\\nThere're " , regular, casesense;
replaceallfast "(?<!and|but|because|think|if|,) there will be " , "\\.\\nThere will be " , regular, casesense;
replaceallfast "(?<!so|and|but|because|think|like|if|,) it is " , "\\.\\nIt is " , regular, casesense;
replaceallfast "(?<!so|and|but|because|think|like|if|,) it's " , "\\.\\nIt's " , regular, casesense;
replaceallfast "(?<!so|and|but|because|think|like|if|,) it was " , "\\.\\nIt was " , regular, casesense;
replaceallfast "(?<!so|and|but|because|think|like|if|,) it will " , "\\.\\nIt will " , regular, casesense;
replaceallfast "(?<!so|and|but|because|think|like|if|,) it'll " , "\\.\\nIt'll " , regular, casesense;
replaceallfast "(?<!so|and|but|because|think|like|which|what|when|where|how|that|then|,) we are " , "\\.\\nWe are " , regular, casesense;
replaceallfast "(?<!so|and|but|because|think|like|which|what|when|where|how|that|then|,) we're " , "\\.\\nWe're " , regular, casesense;

// Careful: these are fairly often relative pronouns
replaceallfast "(?<!so|and|but|because|think|like|,) that is " , "\\.\\nThat is " , regular, casesense;
replaceallfast "(?<!so|and|but|because|think|like|,) that's " , "\\.\\nThat's " , regular, casesense;
replaceallfast "(?<!so|and|but|because|think|like|,) that will " , "\\.\\nThat will " , regular, casesense;
replaceallfast "(?<!so|and|but|because|think|like|,) that'll " , "\\.\\nThat'll " , regular, casesense;

// so / but / and / because generate a huge number of line breaks, so they carry many exceptions
// Without breaks the sentences tend to get very long, so breaking takes priority for readability
replaceallfast " and so on " , " and so on\\.\\n" , regular, casesense;
replaceallfast " and so (?!on)" , "\\.\\nAnd so " , regular, casesense;

// !!!!!!!!!! SO !!!!!!!!!!
replaceallfast "(?<!is|was|are|were|And|But|,) so (?!much|many|cool)" , "\\.\\nSo " , regular, casesense;

replaceallfast " and at the end of the day " , "\\.\\nAnd at the end of the day " , regular, casesense;
replaceallfast " but at the end of the day " , "\\.\\nBut at the end of the day " , regular, casesense;
replaceallfast "(?<!and|but|,) at the end of the day " , "\\.\\nAt the end of the day " , regular, casesense;

// !!!!!!!!!! BUT !!!!!!!!!!
replaceallfast " but (?!then|on|at)" , "\\.\\nBut " , regular, casesense;

// !!!!!!!!!! AND !!!!!!!!!!
replaceallfast " and then " , "\\.\\nAnd then " , regular, casesense;
replaceallfast "(?<!more|over|pros|thousands|up|ups|,) and (?!so|at|I)" , "\\.\\nAnd " , regular, casesense;

// !!!!!!!!!! BECAUSE !!!!!!!!!!
replaceallfast " because (?!of)" , "\\.\\nBecause " , regular, casesense;

// If not breaking turns out to be better in most cases, these should probably be removed
// The pattern has only one group, so the replacement refers to \\1
replaceallfast " which (is|means) " , "\\.\\nWhich \\1 " , regular, casesense;
replaceallfast " that means " , "\\.\\nThat means " , regular, casesense;
replaceallfast "(?<!what|,) I mean " , "\\.\\nI mean, " , regular, casesense;

// Handle lines that consist only of "So.", "But." or "And." after the so / but / and breaks above
// Remove the line break, and if the next character is uppercase, make it lowercase
replaceallfast "(So)\\.\\n([A-Z])" , "So \\(2,ToLower)" , regular, casesense;
replaceallfast "(But)\\.\\n([A-Z])" , "But \\(2,ToLower)" , regular, casesense;
replaceallfast "(And)\\.\\n([A-Z])" , "And \\(2,ToLower)" , regular, casesense;

endgroupundo 1;

// ‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥
// Replacements I decided to hold off on
// right -> , right?
// replaceallfast "(?<!so|and) of course " , "\\.\\nOf course, " , regular, casesense;
// replaceallfast "(yet)\\.\\n([A-Z])" , "yet \\(2,ToLower)" , regular, casesense;
// ‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥‥
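
For readers who want to try the same idea outside Hidemaru, here is a rough Python sketch of a handful of the rules above. It is not a port of the macro: the function name tidy_captions and the sample sentences are made up for illustration, and only a few rules are included. One difference to be aware of is that Python's re module needs fixed-width lookbehinds, so the macro's single (?<!is|was|...) is written as several separate lookbehinds here.

```python
import re


def tidy_captions(raw: str) -> str:
    """Apply a small, illustrative subset of the macro's rules (hypothetical helper)."""
    # Join all caption lines into one line, like the macro's first replacement.
    text = " ".join(raw.split())
    # Drop the bracketed scene descriptions, then collapse leftover spaces.
    text = text.replace("[Music]", "").replace("[Applause]", "")
    text = " ".join(text.split())
    # Capitalize a standalone "i" (this also covers i'm, i've, i'll, i'd).
    text = re.sub(r"\bi\b", "I", text)
    # Uppercase a few acronyms; the list is shortened for the example.
    text = re.sub(r"\b(gdc|vfx|ui)\b", lambda m: m.group(1).upper(), text)
    # Break before "so", except after is/was/are/were or a comma,
    # and except before much/many/cool.
    text = re.sub(
        r"(?<!is)(?<!was)(?<!are)(?<!were)(?<!,) so (?!much|many|cool)",
        ".\nSo ",
        text,
    )
    # Break before "but", except before then/on/at.
    text = re.sub(r" but (?!then|on|at)", ".\nBut ", text)
    # Undo breaks that left "So.", "But." or "And." alone on a line.
    text = re.sub(
        r"\b(So|But|And)\.\n([A-Z])",
        lambda m: m.group(1) + " " + m.group(2).lower(),
        text,
    )
    return text


if __name__ == "__main__":
    sample = (
        "[Music] hello everyone i'm glad\n"
        "to be at gdc so let's get started but\n"
        "first it is so cool to see you"
    )
    print(tidy_captions(sample))
```

As in the macro, the order matters: the final cleanup pass only makes sense after the "so" and "but" rules have run, because it repairs the one-word lines those rules can leave behind.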