The Weekly Source Code 41 - Searching Code, Sharing Code, and Reading Code (and Comments)


[Original post] The Weekly Source Code 41 – Searching Code, Sharing Code, and Reading Code (and Comments)

[Originally published] 2009-04-28 15:41

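I'm a big advocate of reading as much source code as you can, because if you want to become a better programmer, reading code has to go hand in hand with writing it. That's the whole point of the Weekly Source Code: read code, and you'll become a better developer.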

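Reading code in open source projects is a good way to learn, especially when the project is mature and successful, or when you already respect the people on its development team. At the very least, searching and sharing code will help you find useful snippets.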

Searching Code

In the "code search engine" space there are three main players that I've been looking at, though there may well be others.

Koders.com (Phil Haack used to work there)

Krugle - MSDN uses Krugle to power the MSDN Code Search Preview, which searches the MSDN Library, MSDN Code Gallery, and CodePlex.

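(Note: the Krugle MSDN code search is only a preview and is often under active development, so it isn't very stable. That's why a login dialog keeps popping up. Just wait a moment and it will go away.)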

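Google Code Search - If you host a lot of code on your site, you can make it easier for them to index by extending your sitemap to include the CodeSearch extensions. You can also point them at your Subversion repository if you like.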

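Codase - The undead. The site is still up, but the home page hasn't been updated in four years. What happened? Freaky. I don't count them among the players, but in my opinion they were pioneers.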

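We all know that 99% of web searches are free-text searches. Every once in a while I'll use an advanced technique like "filetype:ppt" or a numeric range like "$1500..$3000", but seriously, that's less than 1% of the time. If you say you do it often, I'm going to call you out. Free text pretty much gets it right, unless you're searching for homonyms or aren't being specific enough.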

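Free text is great, but I still like the idea of a code search engine, and I do use them. That said, I'm not sure anything beyond "specific free-text search" is really needed. Personally, I find that the more specific the query, the easier it is to filter the one result you need out of the pile.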

However, when you're searching for code, you will occasionally need the advanced techniques.

Here are some examples:

       Google Code Search:

                o BTree lang:C#

                o license:bsd xdocument lang:c#

      MSDN Code Search via Krugle

                o btree filename:"*.h"

                o Poop findin:comment

      Koders

                o cdef:md5 (a class called MD5)

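If you're looking for a specific implementation of something like an MD5 hash or a BTree, these search engines are very useful, although they can't guarantee the quality of what you find.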
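To make the difference between free text and the "advanced" operators concrete, here is a minimal Python sketch that splits a query into plain search terms and `operator:value` filters. The operator names are simply the ones that appear in the example queries above; the `CodeQuery` class and `parse_query` function are hypothetical, and nothing here reflects how Google Code Search, Krugle, or Koders actually parse queries.

```python
import shlex
from dataclasses import dataclass, field

# Assumption: operator names copied from the example queries above.
KNOWN_OPERATORS = {"lang", "license", "filename", "findin", "cdef"}


@dataclass
class CodeQuery:
    terms: list = field(default_factory=list)    # free-text words
    filters: dict = field(default_factory=dict)  # operator -> value


def parse_query(query: str) -> CodeQuery:
    """Split a query into free-text terms and operator:value filters."""
    parsed = CodeQuery()
    # shlex.split keeps a quoted value such as filename:"*.h" in one token
    # and strips the quotes.
    for token in shlex.split(query):
        key, sep, value = token.partition(":")
        if sep and value and key.lower() in KNOWN_OPERATORS:
            parsed.filters[key.lower()] = value
        else:
            parsed.terms.append(token)
    return parsed


if __name__ == "__main__":
    print(parse_query('btree filename:"*.h"'))
    print(parse_query("license:bsd xdocument lang:c#"))
```

The free-text terms still do most of the work; the filters only narrow the candidate set, which matches the "specific free-text search" idea described earlier.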

Sharing Code

On your blog...

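I've started using SyntaxHighlighter for all the code snippets on my blog, and I'm kicking myself for not doing it sooner. Its biggest advantage is that all my snippets live inside <pre> tags, which makes them easy to index and keeps the markup clean. SyntaxHighlighter does its work on the client side, via JavaScript. I write the code in my posts with the PreCode plugin for Windows Live Writer and add a note at the end saying the code is mine. It has quickly become a standard way to post code inline. I converted the NerdDinner PDF to HTML and convinced ScottGu (by doing it for him without asking) to use SyntaxHighlighter as well.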
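As a rough illustration of why snippets kept in <pre> tags are easy to index, the following Python sketch pulls the text of every <pre> element out of a page using only the standard library. The `PreExtractor` class and `extract_snippets` function are hypothetical names, and the `class="brush: csharp"` attribute in the sample markup is an assumption about how a highlighter might tag a block; the extractor ignores attributes entirely.

```python
from html.parser import HTMLParser


class PreExtractor(HTMLParser):
    """Collect the text content of every <pre> element in an HTML page."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.snippets = []
        self._depth = 0    # greater than 0 while inside a <pre> element
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "pre" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                self.snippets.append("".join(self._buffer))
                self._buffer = []

    def handle_data(self, data):
        if self._depth > 0:
            self._buffer.append(data)


def extract_snippets(html: str) -> list:
    """Return the text of each <pre> block, in document order."""
    parser = PreExtractor()
    parser.feed(html)
    parser.close()
    return parser.snippets


if __name__ == "__main__":
    # Assumed sample markup; the class attribute is illustrative only.
    page = '<p>Intro</p><pre class="brush: csharp">var ok = 1 &lt; 2;</pre>'
    for snippet in extract_snippets(page):
        print(snippet)
```

Because the highlighting happens later in the browser, the page source still contains the plain snippet text, which is what a crawler or a script like this one sees.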

Via the web or IM

Everyone knows that pasting code into an IM window is painful. I use a code-pasting service instead.

I really like Josh Goebel's Pastie for sharing code, although it doesn't support C# or VB. So I've been gradually moving over to Gist.GitHub.com.

Here are what I consider the best code-snippet sharing sites:

· Pastie.org - The original, and a favorite of Rubyists. It couldn't be simpler.

· Pastebin.com - A favorite of IRC users. Supports C# and many other languages. It also lets a paste "expire" after a day or a month. Very handy for email and IM.

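· Gist.GitHub.com - Wins the naming award: "gist" is pronounced like "jist" and means "the essence", which is exactly what you are trying to get across. Supports nearly every language and also has "private" gists. It supports versioning too, which is unique and very useful, given that GitHub is a "social source control system". The screenshot below shows how this idea can be taken further.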

Jeff and I have also had an interesting conversation or two about code sharing. We'll see what comes of it.

Code Comments

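This is a bit of a tangent, but still fun. I think every developer should have a blog, or at least write something. Those who don't often express themselves through code comments.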

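There's a great thread on StackOverflow looking for "the best source code comment you have ever seen". Inevitably, it turned into a list of the worst comments people had come across. Because that's how programmers work, isn't it? Best == worst. A lot like The Daily WTF.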

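If you use a code search engine to look for things that shouldn't be in code, you are sure to find some fun material. This guy searched the Linux source code for swear words and charted them all.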

Here are some other, vaguer examples:

Searching for Poop on Koders

; Poor-man’s Object-Oriented Programming
; or
; POOP
(module POOP
(import Utility)

Searching for Mind-Numbing on Krugle

// mind numbing: let caller use sane calling convention (as per javadoc, 3 params),
// OR the 2.0 calling convention (no ptions) – we really love backward compat, don’t we?

Searching for Hate on Google Code Search

 # God, I hate DTDs.  I really do.  Why this idiot standard still
 # plagues us is beyond me.

Searching for Horrible on Google Code Search

     case 'H':
           horrible++;
      break;

Searching for God Himself on Koders

Not exactly a comment, but still fun:

       (c.query_gender().equals("male") ? "He" : (c.query_gender().equals("female") ? "She" : "It"))
                 + " is " +
                 ((c.query_level() == client.WIZ_GOD) ?
                       "the Almighty God himself\n\rBeware of his wrath if you don't follow his laws!" :
                ((c.query_level() > client.MORTAL) ? "a powerful immortal" : "a puny mortal")))+ "\n\r"

Searching for profoundly bad on Google Code Search

       if isinstance(real_child, SilentMock):
           raise TypeError("Replacing a mock with another mock is a profoundly bad idea.\n" +
                           "Try re-using mock \"%s\" instead" % (name,))

Searching for Pure Evil on Google Code Search

       my $db = delete $access->{db};
                 # This is pure evil.
          $db->DESTROY;

Searching for Poop in Ruby files only on Google Code Search

       "Stimpy-drool",
       "poopy",
       "poop",
"craptacular carpet droppings",

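I'm sure that if you go searching you'll find plenty of good stuff in comments, even funnier than these. For example, the Greatest Code Comment Ever (line 107), a tip from Cam Soper: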

uint32 sign=[fh readUInt32BE];
uint32 marker=[fh readUInt32BE];
uint32 chunklen=[fh readUInt32BE];
off_t nextchunk=[fh offsetInFile]+((chunklen+3)&~3);
// At this point, I'd like to take a moment to speak to you about the Adobe PSD format.
// PSD is not a good format. PSD is not even a bad format. Calling it such would be an
// insult to other bad formats, such as PCX or JPEG. No, PSD is an abysmal format. Having
// worked on this code for several weeks now, my hate for PSD has grown to a raging fire
// that burns with the fierce passion of a million suns.
// If there are two different ways of doing something, PSD will do both, in different
// places. It will then make up three more ways no sane human would think of, and do those
// too. PSD makes inconsistency an art form. Why, for instance, did it suddenly decide
// that *these* particular chunks should be aligned to four bytes, and that this alignement
// should *not* be included in the size? Other chunks in other places are either unaligned,
// or aligned with the alignment included in the size. Here, though, it is not included.
// Either one of these three behaviours would be fine. A sane format would pick one. PSD,
// of course, uses all three, and more.
// Trying to get data out of a PSD file is like trying to find something in the attic of
// your eccentric old uncle who died in a freak freshwater shark attack on his 58th
// birthday. That last detail may not be important for the purposes of the simile, but
// at this point I am spending a lot of time imagining amusing fates for the people
// responsible for this Rube Goldberg of a file format.
// Earlier, I tried to get a hold of the latest specs for the PSD file format. To do this,
// I had to apply to them for permission to apply to them to have them consider sending
// me this sacred tome. This would have involved faxing them a copy of some document or
// other, probably signed in blood. I can only imagine that they make this process so
// difficult because they are intensely ashamed of having created this abomination. I
// was naturally not gullible enough to go through with this procedure, but if I had done
// so, I would have printed out every single page of the spec, and set them all on fire.
// Were it within my power, I would gather every single copy of those specs, and launch
// them on a spaceship directly into the sun.
//
// PSD is not my favourite file format.
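The one line of real work in that snippet is the `(chunklen+3)&~3` expression, which rounds a length up to the next multiple of four. Here is a small Python sketch of the same arithmetic. The function names are hypothetical, and this only illustrates the rounding the comment complains about (padding that is not counted in the stored length); it is not a PSD parser.

```python
def align_up_4(length: int) -> int:
    """Round a non-negative length up to the next multiple of four.

    Adding 3 pushes any value that is not already a multiple of four past
    the next boundary; masking with ~3 then clears the two low bits.
    """
    return (length + 3) & ~3


def next_chunk_offset(current_offset: int, chunk_len: int) -> int:
    """Offset of the following chunk when the padding is NOT part of chunk_len."""
    return current_offset + align_up_4(chunk_len)


if __name__ == "__main__":
    for n in (0, 1, 4, 5, 7, 8):
        print(n, "->", align_up_4(n))
```

A chunk that declares a length of 5 therefore occupies 8 bytes in the file, which is why the reader has to skip the padded length rather than the stored one.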

Enjoy your searching, and be sure to read code!

