MongoDB Self-Study Notes

MongoDB in Action

[US] Kyle Banker

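1. Key Features

1.1 Document Data Model

Data is stored in a JSON-like format, so a one-to-many relationship does not have to be spread over several tables as it would be in a relational database.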
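As a rough illustration of the point above, a post and its comments can live in one document. The sketch below uses a plain Python dict; the field names and values are hypothetical and not taken from the book.

```python
# A hypothetical blog post stored as a single document.
# The one-to-many relation (post -> comments) is embedded as a list
# instead of living in a separate "comments" table.
post = {
    "title": "Hello MongoDB",
    "tags": ["database", "nosql"],
    "comments": [
        {"author": "alice", "text": "Nice post"},
        {"author": "bob", "text": "Thanks"},
    ],
}

# Reading the "many" side needs no join: it is already inside the document.
comment_authors = [c["author"] for c in post["comments"]]
print(comment_authors)
```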

1.2 Ad Hoc Queries

MongoDB supports ad hoc queries through its query language, for example: db.posts.find({'tags': 'politics', 'vote_count': {'$gt': 10}});
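To make the meaning of that query concrete, here is a small pure-Python sketch that applies the same two conditions to an in-memory list. The sample documents are hypothetical; only the field names come from the query, and no MongoDB server is involved.

```python
# Hypothetical sample documents; only the field names come from the query above.
posts = [
    {"title": "a", "tags": ["politics", "news"], "vote_count": 12},
    {"title": "b", "tags": ["politics"], "vote_count": 3},
    {"title": "c", "tags": ["sports"], "vote_count": 40},
]

def matches(doc):
    # 'tags': 'politics' matches when the tags array contains that value;
    # a greater-than-10 condition on vote_count matches values strictly above 10.
    return "politics" in doc["tags"] and doc["vote_count"] > 10

result = [doc["title"] for doc in posts if matches(doc)]
print(result)
```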

1.3 Secondary Indexes

B-tree

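1.4 Replication

Replication is provided through the replica set topology. A replica set consists of one primary node and several secondary nodes. The primary accepts both reads and writes, while the secondaries are read-only. If the primary fails, one of the secondaries is automatically chosen to replace it.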

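1.5 Speed and Durability

Speed is how quickly operations on the data complete; durability is the assurance that data, once written, is kept safely over time. MongoDB lets you choose whether journaling is enabled.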

1.6 Scaling

Sharding

1.7 Command-Line Tools

  • Back up and restore a database: mongodump/mongorestore
  • Import and export JSON/CSV/TSV data: mongoexport/mongoimport
2. Common MongoDB Shell Commands
  • db.users.save()/db.users.insert(): add a document
  • db.users.count(): count documents
  • db.users.find(): query documents
  • db.users.update({username:"smith"}, {$set: {country: "Canada"}}): add a country field
  • db.users.update( {username: "smith"}, { $set: {favorites: { cities:["Chicago", "Cheyenne"], movies: ["Casablanca", "The Sting"] } }}): add a nested favorites field
  • db.users.remove(): delete documents
  • db.users.drop(): delete the collection together with all of its indexes
  • db.numbers.ensureIndex({num:1}): create an index on the num key
  • Basic database operations
    • show dbs
    • show collections
    • db.stats()/db.numbers.stats(): get low-level information about the database/collection
3. Writing Programs with MongoDB

ruby


MongoDB: The Definitive Guide

[US] Kristina Chodorow & Michael Dirolf


MongoDB University

M101P: MongoDB for Developers

Chapter 1: Introduction

1.1 db.collection.insert(), db.collection.find().pretty(): pretty() configures the cursor to display results in an easy-to-read format.

1.2 cursor.hasNext()/next() are used to walk through every document in a collection with the cursor.

1.3 del(g['name']) is used to delete an item from a Python dictionary.
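A minimal sketch of the dictionary deletion from 1.3. The dictionary contents are hypothetical; pop() is shown only as a common alternative for the case where the key may be missing.

```python
g = {"name": "apple", "color": "red"}

# del removes the key and its value in place.
del g["name"]
print(g)

# Deleting a missing key with del raises KeyError;
# pop() with a default value avoids that.
removed = g.pop("name", None)
print(removed)
```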

1.4 curl -i localhost:8080 returns the response headers together with the content.

1.5 Set a cookie from a value read from the submitted form with get():

fruit = bottle.request.forms.get("fruit")
bottle.response.set_cookie("fruit", fruit)
bottle.redirect("/show_fruit")

Chapter 3: Schema Design

3.1 foreign key constraints

Chapter 4: Performance

4.1 Pluggable storage engines include MMAP (default) and WiredTiger (2014)

  • WiredTiger: document-level concurrency, data compression

4.2 Index

  • create Index: db.students.createIndex({student_id:1});
  • delete Index: db.students.dropIndex({student_id:1});

create Index on tags

  • $elemMatch: db.students.explain().find({'scores':{$elemMatch: {type:'exam', score:{'$gt':99.8}}}}); it matches documents in which a single array element satisfies all of the listed conditions.
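The difference between $elemMatch and listing the conditions separately can be sketched in plain Python. This is only an illustration of the matching rule on a hypothetical student document, not how the server evaluates queries; both function names are made up for the example.

```python
def elem_match(scores, wanted_type, min_score):
    # True only if ONE array element has both the type and a high enough score.
    return any(s["type"] == wanted_type and s["score"] > min_score for s in scores)

def separate_conditions(scores, wanted_type, min_score):
    # Without $elemMatch, each condition may be met by a different element.
    has_type = any(s["type"] == wanted_type for s in scores)
    has_score = any(s["score"] > min_score for s in scores)
    return has_type and has_score

# Hypothetical student: the exam score is low, the quiz score is high.
scores = [{"type": "exam", "score": 50.0}, {"type": "quiz", "score": 99.9}]

with_elem_match = elem_match(scores, "exam", 99.8)
without_elem_match = separate_conditions(scores, "exam", 99.8)
print(with_elem_match, without_elem_match)
```

Here the student has no exam above 99.8, so only the $elemMatch-style check rejects the document.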

  • create unique index: db.stuff.createIndex({thing:1}, {unique:true});
  • delete one duplicate item: db.stuff.remove({thing:'apple'}, {justOne: true});
  • sparse index: cannot be used for sorting when that would leave out documents that lack the indexed field
  • create the index in the background: db.students.createIndex({'scores.score':1},{background:true});
  • Explain: verbosity modes are queryPlanner (default), executionStats and allPlansExecution
  • Covered queries: both the query conditions and the returned fields are served entirely from an index
    • all the fields in the query are part of an index;
    • all the fields returned in the results are in the same index
    • db.collection.find(query, projection): query (the conditions to match), projection (the keys to return)
  • Geospatial Index:
    • ensure index: db.stores.ensureIndex({location: '2d', type:1});
    • db.stores.find({location:{$near:[50,50]}});
  • Geospatial Spherical: Longitude, Latitude. Using GeoJSON
    • db.places.ensureIndex({'location': '2dsphere'});
    • db.places.find({location: {$near: {$geometry: {type: "Point", coordinates: [122,37]}, $maxDistance: 2000}}});
    • db.stores.find({loc: {$near: {$geometry: {type: "Point", coordinates: [-130,39]}, $maxDistance: 1000000}}});
  • Text Indexes: db.sentences.ensureIndex({'words': 'text'});
    • db.sentences.find({$text:{$search:'dog tree obsidian'}}, {score:{$meta: 'textScore'}}).sort({score:{$meta:'textScore'}}); This command lists the matching text ordered by how well it matches the search words.
    • db.students.find({student_id:{$gt: 500000}, class_id: 54}).sort({student_id:1}).hint({class_id:1}).explain("executionStats")
    • field order in a compound index: equality fields, then sort fields, then range fields
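The two covered-query conditions listed above can be written as a rough checker. This is a simplified sketch: is_covered is a hypothetical helper, it ignores special cases such as multikey indexes, and it assumes that _id is returned unless the projection excludes it.

```python
def is_covered(index_fields, query_fields, projection):
    """Rough check of the covered-query conditions.

    projection maps a field name to 1 (include) or 0 (exclude).
    """
    index = set(index_fields)
    if not any(projection.values()):
        # An empty or exclusion-only projection returns the other fields
        # of the document, so the index alone cannot answer the query.
        return False
    returned = {field for field, on in projection.items() if on}
    if projection.get("_id", 1):
        # _id comes back by default unless it is explicitly excluded.
        returned.add("_id")
    return set(query_fields) <= index and returned <= index

# Hypothetical compound index on i, j, k.
covered = is_covered(["i", "j", "k"], ["i"], {"_id": 0, "i": 1, "j": 1})
id_leaks = is_covered(["i", "j", "k"], ["i"], {"i": 1, "j": 1})
print(covered, id_leaks)
```

The second call fails the check only because _id is still returned and is not part of the index.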

4.3 Profile mode

  • 0: off, nothing is logged
  • 1: log slow queries
  • 2: log every query
  • command: mongod --dbpath /usr/local/var/mongodb --profile 1 --slowms 2
  • db.system.profile.find({ns:/test.foo/}).sort({ts:1}).pretty()
  • db.system.profile.find({millis:{$gt:1}}).sort({ts:1}).pretty()
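The three profiling levels can be summarized as a small decision function. This is a sketch of the rule only; is_profiled is a hypothetical name, and the default threshold of 100 ms is an assumption about the server default rather than something stated in these notes.

```python
def is_profiled(level, millis, slowms=100):
    # level 0: profiler off; level 1: only operations slower than slowms;
    # level 2: every operation.
    if level == 0:
        return False
    if level == 1:
        return millis > slowms
    return True

# With --profile 1 --slowms 2, a 5 ms query is logged and a 1 ms query is not.
slow_logged = is_profiled(1, 5, slowms=2)
fast_logged = is_profiled(1, 1, slowms=2)
print(slow_logged, fast_logged)
```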

Answer 4.3

db.posts.ensureIndex({ date : -1});
db.posts.ensureIndex({ tags : 1,  date : -1});
db.posts.ensureIndex({ permalink : 1});

5. Aggregation Framework

5.1 Count the number of products in each category:

db.products.aggregate([
	{ $group:
		{
			"_id": "$category",
			"num_products": {"$sum": 1}
		}
	}
]);

5.2 aggregation pipeline
  • $project
  • $match
  • $group
  • $sort
  • $skip
  • $limit
  • $unwind
  • $out
  • $redact
  • $geoNear

5.3 aggregation expressions: aggregate([stage1, stage2], {options})

  • $sum: db.zips.aggregate([{"$group": {"_id":"$state","population":{$sum:"$pop"}}}]) sums the population grouped by state
  • $avg: db.zips.aggregate([{$group: {"_id":"$state","average_pop":{$avg:"$pop"}}}])
  • $min
  • $max: db.zips.aggregate([{$group: {_id: "$state", pop: {$max:"$pop"}}}])
  • $push
  • $addToSet: db.zips.aggregate([{$group: {"_id":"$city", "postal_codes":{$addToSet:"$_id"}}}])
  • $first/$last: only meaningful after a $sort stage
  • $project

    db.products.aggregate([
        {$project:
            {
                _id: 0,
                'maker': {$toLower: "$manufacturer"},
                'details': {'category': "$category",
                            'price': {"$multiply": ["$price", 10]}
                           },
                'item': '$name'
            }
        }
    ])

If you want to include a key exactly as it is named in the source document, write key:1, where key is the name of the key. For example: db.zips.aggregate([{$project: {_id: 0, city: {$toLower: "$city"}, pop:1, state:1, zip:"$_id"}}])
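What that $project stage does to one document can be mimicked in plain Python. The input document below is hypothetical (only the field names follow the zips collection), and project is an illustrative helper, not a driver call.

```python
def project(doc):
    # Mirrors {_id: 0, city: {$toLower: "$city"}, pop: 1, state: 1, zip: "$_id"}
    return {
        "city": doc["city"].lower(),  # $toLower
        "pop": doc["pop"],            # pop: 1 keeps the field unchanged
        "state": doc["state"],        # state: 1 keeps the field unchanged
        "zip": doc["_id"],            # the old _id is exposed as zip; _id itself is dropped
    }

# Hypothetical input document using the zips field names.
doc = {"_id": "10001", "city": "NEW YORK", "pop": 1000, "state": "NY"}
projected = project(doc)
print(projected)
```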

  • $match: db.zips.aggregate([{$match: { pop: {$gt:100000}}}])
  • $text:
  • $sort: db.zips.aggregate([{ $sort:{state:1, city:1}}])
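  • $unwind: unjoins array data, producing one document per array element
  • $out (new in MongoDB 2.6): writes the pipeline result to a collection, replacing that collection if it already exists
  • options: {explain:true}, {allowDiskUse:true}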
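The $group accumulators from 5.3 ($sum and $avg keyed by state) can be reproduced in a few lines of plain Python. The documents below are hypothetical stand-ins for the zips collection, so this shows only the grouping logic, not the real data.

```python
from collections import defaultdict

# Hypothetical documents with the zips field names.
zips = [
    {"_id": "1", "state": "NY", "pop": 100},
    {"_id": "2", "state": "NY", "pop": 300},
    {"_id": "3", "state": "CA", "pop": 50},
]

# $group with _id: "$state" collects the documents per state.
groups = defaultdict(list)
for z in zips:
    groups[z["state"]].append(z["pop"])

# {$sum: "$pop"} and {$avg: "$pop"} are then computed per group.
population = {state: sum(pops) for state, pops in groups.items()}
average_pop = {state: sum(pops) / len(pops) for state, pops in groups.items()}
print(population, average_pop)
```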

5.4 Homework

5.4.1 db.posts.aggregate([{$unwind: "$comments"},{$group:{_id: "$comments.author",count : {$sum:1}}},{$sort:{count:-1}}]) This task counts the comments written by each author. The authors sit inside the embedded comments array, so the array has to be unwound first and the result regrouped by author.
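The three stages of that pipeline ($unwind, $group, $sort) can be traced step by step in plain Python. The posts below are hypothetical and much smaller than the homework data.

```python
from collections import Counter

# Hypothetical posts, each with an embedded comments array.
posts = [
    {"title": "p1", "comments": [{"author": "ann"}, {"author": "bob"}]},
    {"title": "p2", "comments": [{"author": "ann"}]},
]

# $unwind: one output record per element of the comments array.
unwound = [{"title": p["title"], "comments": c} for p in posts for c in p["comments"]]

# $group by comments.author with {$sum: 1}, then $sort by count descending.
counts = Counter(u["comments"]["author"] for u in unwound)
ranking = counts.most_common()
print(ranking)
```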

5.4.2 The result is close to the correct answer but smaller. I have not figured out the reason.

db.zips.aggregate([
	{$match:{ $or: [{state:"CA"}, {state:"NY"}]}},
	{$match: {pop: {$gt:25000}}},
	{$group: {_id:0, avg: {$avg:"$pop"}}}
])
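One possible explanation, offered as an assumption rather than a confirmed diagnosis: each document in zips describes a zip code, not a city, so a city split over several zip codes is filtered piece by piece by the pop > 25000 condition. The hypothetical data below shows how summing zip codes into cities first changes the average.

```python
from collections import defaultdict

# Hypothetical data: city A is split over two zip codes.
zips = [
    {"city": "A", "state": "CA", "pop": 20000},
    {"city": "A", "state": "CA", "pop": 20000},
    {"city": "B", "state": "NY", "pop": 30000},
    {"city": "C", "state": "NY", "pop": 10000},
]

def average(values):
    return sum(values) / len(values)

# Pipeline as written above: filter individual zip codes, then average.
zip_level = average(
    [z["pop"] for z in zips if z["state"] in ("CA", "NY") and z["pop"] > 25000]
)

# Alternative: sum zip codes into (state, city) totals first, then filter and average.
city_pop = defaultdict(int)
for z in zips:
    if z["state"] in ("CA", "NY"):
        city_pop[(z["state"], z["city"])] += z["pop"]
city_level = average([pop for pop in city_pop.values() if pop > 25000])

print(zip_level, city_level)
```

In aggregation terms this would mean adding a $group on state and city with a $sum of pop before the population $match, if the assumption about the data holds.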

5.4.3

db.grades.aggregate([
	{ $unwind: "$scores" },
	{ $match: { $or: [ {"scores.type": "homework"}, {"scores.type":"exam"} ] } },
	{ $group: { _id: { 'student_id': "$student_id", 'class_id': "$class_id" }, avg: { $avg: "$scores.score" } } },
	{ $group: { _id: "$_id.class_id", class_avg: { $avg: "$avg" } } },
	{ $sort: { 'class_avg': -1 } }
])

5.4.4

db.zips.aggregate([
{ $project: { _id: 0, city: 1, pop: 1 } },
{ $match: { city: /^(B|D|O|G|N|M).*/ } },
{ $group: { _id: null, pop: { $sum: "$pop" } } },
{ $sort: { city: 1} }
])

5.5 the difference between $group and $project

6. Application Engineering

6.1 Write Concern:

  • w: whether the application waits for the server to acknowledge the write. 1 (wait) / 0 (do not wait)
  • j: whether the application waits until the write has been committed to the journal on disk. true (wait) / false (do not wait)

6.2 Replica Set

# script to create a replica set with 3 members: sudo bash < ./create_replica_set.sh
#!/usr/bin/env bash
mkdir -p /data/rs1 /data/rs2 /data/rs3
mongod --replSet m101 --logpath "1.log" --dbpath /data/rs1 --port 27017 --oplogSize 64 --fork --smallfiles
mongod --replSet m101 --logpath "2.log" --dbpath /data/rs2 --port 27018 --oplogSize 64 --smallfiles --fork
mongod --replSet m101 --logpath "3.log" --dbpath /data/rs3 --port 27019 --oplogSize 64 --smallfiles --fork

# init the replica set (run these lines in the mongo shell)
config = { _id: "m101", members:[
          { _id : 0, host : "localhost:27017"},
          { _id : 1, host : "localhost:27018"},
          { _id : 2, host : "localhost:27019"} ]
};
rs.initiate(config);
rs.status();

  • Reading from a secondary: rs.slaveOk() (rs stands for replica set)
  • connect to a particular member: mongo --port 27017 (or 27018, 27019)
  • secondaries use the oplog to stay synchronized with the primary's data set.
  • make the primary step down so that another member is elected: rs.stepDown()

6.3 Failover and Rollback

This post is licensed under CC BY 4.0 by the author.
