Category Archives: Hack Space

仔细想想还是 Dockerized 吧!

The AI Lab of my mentor was running by me for quite some months. And now it's about time to hand over the docs of the internal server to graduates. Though one of which tends to lose internet connection from time to time due to its location. However, I heart that it had been moved back to university in the middle of July.

And originally, I use Microsoft Word to keep all the records and information of almost everything, but it obviously would cause some issues.

For example, one's docs version may vary from another. Yes, I've thought about to use the cloud storage with version control even. The problem is that we cannot afford the expense of cloud drive. And we could not find someone who's willing to take the charge of reimbursement. The bills have already piled up in my mentor's desk.

Besides that, to use file as docs will inevitably introduce the ugly naming, such as docs-20190807, docs-20190607 or whatever. And it would be totally disaster if to use git for version control. Despite of the unreadable commits, the filename needs to be the same, which extremely likely to be ignored to update from the git repo for some people.

Luckily, there's one instance on AliCloud (Although personally I don't really like AliCloud, but that's another story, let's save it for next time). And lots of packages that can generate static HTML from markdown have been developed these years around.

It would be easy for everyone to access docs online and because the markdown file is pure text, we can have a very good and most important, readable track of changes with git.

The final decision is to use VuePress as the static HTML generator. And to ensure a simple installation process, dockerization is the best shot at the moment. Furthermore, basic HTTP auth is needed to keep unwanted visitors out, leaving the docs only accessible to the lab.

For your convenience, this project is located at my GitHub, https://github.com/BlueCocoa/docs. It's fully prepared and dockerized with docker-compose support.

Continue reading 仔细想想还是 Dockerized 吧!

Just for fun: Compile time fibonacci

所以继续摸个鱼,用 C++ 模版编程写个 fibonacci 计算

当然一开始只是顺手玩玩了,因为 C++ 这里的模版匹配其实挺像函数式编程的(似乎还有人曾用 C++ 的模版匹配来做过 SAT solver 的样子)

先把 C++ 编译时的放在下面,一会儿再补一个正经函数式编程的代码,然后再加一个更优雅的函数式实现(Elixir),最后用 Python 和 C++ 再模仿一下函数式的吧~

Continue reading Just for fun: Compile time fibonacci

Have some fun with C++ template programming and compile time string obfuscation

嗯,摸鱼的时候看到了一个 C++ 编译时混淆字符串的实现,urShadow/StringObfuscator. (然后还顺便又玩了一下 C++ 模版编程)

怎么说呢,这样的通过C++模版来现实的编译时混淆其实特征还是相对比较明显的,另有一种也是在编译的时候去混淆,但是则是由编译器/编译器插件来现实的。

至于说性能的话,直观来说对于绝大多数日常应用,两种方式相比不做混淆来讲,也没有可观察到的区别。不过我也没去做benchmark,有兴趣的倒是可以一试。

urShadow/StringObfuscator 使用上比较简单,但相比编译器插件的方式,还是会需要对代码做出一定的修改。

#include <iostream>
#include "str_obfuscator.hpp"

int main(int argc, const char * argv[]) {
    std::cout << cryptor::create("Hello, World!").decrypt() << std::endl;
    return 0;
}

总的来说实现上很简单,很直接,利用 C++ 模版参数取到要混淆的字符串的长度 S与其本体 str。

Continue reading Have some fun with C++ template programming and compile time string obfuscation

多用户 Docker 环境下 PyPi 源按需加速

这一篇算是接在上一篇Build a super fast on demand local PyPi mirror的后面吧~

这里会以 docker-compose 的方式为例子,详细写一下~不使用docker-compose的话,则也仅仅需要手动指定 pypicache 与需要这个服务的 container 到同一个 docker 网络中,这样就可以不用去找 pypicache 的 IP 地址,对最终用户透明化,不用增加额外的 pip 安装参数,即可轻松享受本地高速缓存,特别是对于大一点的文件效果更明显~

jupyterhub-docker
jupyterhub-docker

Continue reading 多用户 Docker 环境下 PyPi 源按需加速

Build a super fast on demand local PyPi mirror

  • 当公司/局域网里有多人都使用 Python 开发,并且几乎都会用到 pip 来部署环境时,虽然已经有各种镜像源了,但是下载仍受限于与外网的宽带速度,并且同样的包可能被多人下载了多次,在包较大时,重复花的时间并不值
  • 当你使用 Docker 来构建不同的 Python 应用/环境时,在测试 Dockerfile 时可能需要不断的删掉之前 build 的版本,从头开始 build 时,pip 下载与上面面临同样的问题——重复消耗不必要的时间

其一解决方案是公司/局域网内部搞一个 PyPi 的镜像源,实际上维护一个完整的镜像源相当麻烦,占用的储存空间太大,在公司/局域网的情况下,大家开发的东西、使用的技术栈相对比较固定,这就导致完整的镜像源里会有很多包其实几乎没人用。

其二的解决方案可以是预先构建好一个或多个 Docker 镜像,其中包含大家都会用到的包,剩余的一些包则在使用时才被少数需要的人安装。这种方案的缺点则是目前 Docker 服务 + 多用户方案在重启之后会丢掉已经配置过的环境,重启之后依旧需要从镜像源下载包。

那么这里相对一劳永逸的方案则是搭建一个本地的按需下载的 PyPi 镜像源,其原理则是在镜像源与公司/局域网内增加了一个高速缓存,并且由于 PyPi 已经提交分发的whl或者tar.gz是不会变的,因此不用顾虑缓存时间的设置。

最后就像这样~ 182KB/s VS. 36.4MB/s
(cache server为千兆有线链接,MacBook为802.11 AC,测试时链接速度585Mbps)

It's apparently super fast after being cached!
It's apparently super fast after being cached!

Continue reading Build a super fast on demand local PyPi mirror

A brief tutorial on setup an AI lab server for a small team

这个是在之前导师的实验室积累的一些东西,使用场景的话,是适用于2-8人左右的小团队吧,当时有两台机器,一台是放在学校机房的服务器,CPU没注意是什么,印象中是64G内存,4块P20,貌似24G显存?;另一台机器则放在办公室,主要配置的话,一颗AMD Ryzen 2700X,64G内存,再附加两块1080ti 11G,经费肯定是还做不到一人分一块GPU,部分模型的大小也不需要完全独占一块GPU。但是构建一个小型团队使用的AI Lab服务器是没问题了。

当时搭建的AI Lab服务器的主要架构如下

AI Lab Platform Architecture
AI Lab Platform Architecture

系统方面选择了Ubuntu 18.04 LTS,简单方便,毕竟是做AI不是做OS,没有任何必要引入其他方面复杂的操作。然后在这之上则是系统层面的GPU驱动,当时对应的版本为396.26,目前已经有400版本号的驱动了。接下来就是与docker对接的nvidia的runc,由这个runc去给docker内的GPU提供支持。随后当时则是使用了支持多用户的JupyterHub,当然也可以通过分配多个账号解决,这一部分和之后的部分解决方案就很多了。

Continue reading A brief tutorial on setup an AI lab server for a small team

SIGGRAPH 2018 「Semantic Soft Segmentation」复现笔记

santa
santa

SIGGRAPH 2018这篇论文主要分为两大部分,第一部分是 DeepLab v2+ResNet101 训练出来用于获取输入图像的 high-level feature 网络,对于输入的 $I = (h, w, 3)$ 图像,为每一个像素点生成一个 128 维的特征向量,因此该网络的输出是 $F = (h, w, 128)$

接下来,使用 $F$ 和 $I$ 进行引导滤波,$F_{filtered} = imguidedfilter(F, I, 10, 0.01)$,在引导滤波这个地方,OpenCV中的 cv2.ximgproc.guided_filter 与原作者使用的 matlab 中实现的 imguidedfilter 有不小区别,于是我对着 matlab 中 imguidedfilter 的实现重写了一下 OpenCV 版的,bluecocoa/imguidedfilter-opencv

在计算完了 $F_{filtered}$ 之后,利用 PCA 将它压缩到 3 维,$F_{PCA} = PCA(F_{filtered}, 3)$,如下图。

santa PCA
santa PCA

Continue reading SIGGRAPH 2018 「Semantic Soft Segmentation」复现笔记

Monitoring Ivar Changes in Objective-C

As we've mentioned in the last post, Protection against Message Forward in Objective-C, there're at least two tools for tracing the calling sequence of the methods,

However, they just cannot handle it in the scene below,

@interface ProtectedClass : NSObject {
@public
    NSString * _password;
}
@property (nonatomic, getter=password, setter=setPassword:) NSString * password;
@end
/// ...omited...
    ProtectedClass * obj = [[ProtectedClass alloc] init];
    obj->_password = @"喵咕咪~"; // directly access, undectectable in BigBang or ANYMethodLog
    [obj setPassword:@"喵"]; // BigBang or ANYMethodLog dectectable
/// ...omited...

Because it's not necessarily to call getter or setter in Objective-C when access or change an ivar. Since Objective-C is just a superset of C, so the object (or instance) in Objective-C acts pretty much like the struct in C. You can directly access its member if you have the memory address. Let's check out what happens when compiling.

Here is our code, written in Objective-C, and it's probably quite often to be seen in your projects.

Objective-C code
Objective-C code

Continue reading Monitoring Ivar Changes in Objective-C